UNCLASSIFIED 


AD  NUMBER 


AD876563 


NEW  LIMITATION  CHANGE 
TO 

Approved  for  public  release,  distribution 
unlimited 


FROM 

Distribution  authorized  to  U.S.  Gov't, 
agencies  only;  Administrative/Operational 
Use;  11  AUG  1970.  Other  requests  shall  be 
referred  to  the  Office  of  Naval  Research, 
Attn:  Code  466,  Arlington,  VA  22217. 


AUTHORITY 


ONR  ltr,  29  Aug  1973 


THIS  PAGE  IS  UNCLASSIFIED 


General Dynamics

Electric Boat Division


Reproduction in whole or in
part is permitted for any purpose
of the United States Government.


PROCESSING  OF  DATA 
FROM  SONAR  SYSTEMS 
Volume  VII 


Each transmittal of this document outside the Agencies of the
U.S. Government must have prior approval of the Office of
Naval Research (Code 466).




This research was sponsored by the Office of
Naval Research under Contract N00014-68-C-
0392 (ONR Contract Authority Identification
Number Ml-i’SG-OOl-l) with General Dynamics
Corporation, Electric Boat Division. The work
was performed under subcontract by Yale
University.


Franz B. Tuteur
John H. Chang
Verne H. MacDonald
James P. Gray


Examined:

J. W. Herring
Project Engineer


Approved:

A. J. van Woerkom
Manager of Scientific Research




i  !  1  7-7  0-0:11 
August 11, 1970


ABSTRACT 


Volume  VII  deals  with  the  following  topics: 

1. Optimum Detector for Nonisotropic Noise

The previously obtained expression for the optimum detector, valid when
signal and noise are zero-mean Gaussian processes and when the noise may
contain interference components, is analyzed to determine the detailed
structure of the detector. The detector turns out to contain beamformers
that are aimed at the target signal and at each interference; the signals
from the interference beams are passed through rather complex filters and
then subtracted from the target signal. The complexity of the optimum
filter relative to conventional systems is examined, and it is found that
the added complexity is quite moderate.

2.  Adaptive  Array  Processing 

The  optimum  detector  discussed  above  is  most  easily  constructed  by 
using  transversal  filters,  consisting  of  a  tapped  delay  line  and  adjustable 
weights  applied  to  the  taps.  Algorithms  based  on  the  method  of  stochastic 
approximation  for  automatically  adjusting  these  weights  are  considered  in 
this  section  and  conditions  for  convergence  and  rate  of  convergence  under 
several  different  conditions  are  obtained. 

3.  Optimum  Passive  Bearing  Estimation  in  a  Spatially  Coherent 
Noise  Environment 

The Cramer-Rao lower bound is computed for the bearing estimator,
subject to the assumption that interference noise is present. The results
are compared with those obtained for a modified split-beam tracker
employing simple interference nulling.

4. Space-Time Properties of Sonar Detection Models

The problem of optimizing array configurations is not a well-posed
problem unless it can be shown that an optimum actually exists. Many
commonly used models for sonar detection systems turn out to be singular,
so that the optimum does not exist, i.e. it is infinite. A rigorous
examination of the problem of model singularity, using measure-theoretic
considerations, is undertaken in this section, and general criteria for
nonsingularity of models are developed.


CONTENTS

Abstract
Foreword

I    Introduction
II   The Optimum Detector for Nonisotropic Noise
III  Adaptive Array Processing
IV   Optimum Passive Bearing Estimation in a Spatially
     Coherent Noise Environment
V    Space-Time Properties of Sonar Detection Models

Report  Title

38      The Optimum Detector for Nonisotropic Noise
39      Adaptive Array Processors
40      Optimum Passive Bearing Estimation in a Spatially
        Coherent Noise Environment
41      Space-Time Properties of Sonar Detection Models


FOREWORD 


This is the seventh in a series of reports describing work performed by Yale
University under a subcontract with Electric Boat division of General Dynamics,
prime contract number N00014-68-C-0392. The Office of Naval Research is sponsor
for this contract; LCDR J. F. Lyding is Project Officer for ONR. Mr. J. W. Herring
is Project Engineer for Electric Boat division under the direction of Dr. A. J.
van Woerkom, Manager of Scientific Research.




I.  Introduction 


This report is the first of two volumes dealing with work completed
under contract 8050-31-55001 between Yale University and the Electric Boat
Company during the period from July 1, 1968 to April 30, 1970. More
detailed discussions of the results are contained in the four progress
reports Nos. 38, 39, 40 and 41, which are appended. The companion volume
(vol. VIII of this series) covers work done during the same time period
and contains results submitted originally in progress reports Nos. 42 and 43.
Three of the topics contained in this volume are continuations of work
covered in earlier reports, dealing with the effects of anisotropy of the
background noise field, also referred to as interference noise. The three
progress reports deal respectively with the form of the optimum detector,
with the behavior of adaptive detectors, and with bearing estimation under
these noise conditions. The fourth topic, which deals with the effect of
signal models and the various possibilities for singular detection, is
entirely new and represents a substantial departure from work described in
previous reports.

II.  The  Optimum  Detector  for  Nonisotropic  Noise 

An expression for the optimum detector transfer function when the noise
contains one or more strong interference components was originally obtained
in Progress Report No. 33, which is part of volume V of this series. The
implication of this expression for the detailed structure of the detector is
examined in Progress Report No. 38.

The results of both reports are based on the assumptions that the
signal, noise, and interference are all sample functions of a zero-mean
Gaussian random process, that the interference consists of a number of
isolated point sources, and that the noise is otherwise isotropic and
far-field. Under these conditions the filter can be shown to separate into a
spatial part (essentially a set of beamformers) and a temporal part, or
Eckart filter. For the case of a single interference the spatial part,
which is also the significant part, takes the form:


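$$
H'(\omega) =
\begin{bmatrix} e^{j\omega\tau_1} \\ \vdots \\ e^{j\omega\tau_M} \end{bmatrix}
- \frac{K_1(\omega)\,G_{10}(\omega)}{1 + M K_1(\omega)}
\begin{bmatrix} e^{j\omega\delta_1} \\ \vdots \\ e^{j\omega\delta_M} \end{bmatrix}
$$

where the $\tau_i$ are the signal delays, the $\delta_i$ are the interference
delays, $M$ is the number of hydrophones, $K_1(\omega)$ is the ratio of
interference spectral density to ambient noise density, and $G_{10}(\omega)$
is given by

$$
G_{10}(\omega) = \sum_{k=1}^{M} e^{j\omega(\tau_k - \delta_k)}
$$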

This result can be interpreted to mean that the filter contains a simple
beamformer aimed at the signal and a second beamformer aimed at the
interference, and that the interference output is subtracted from the signal
output after being passed through a filter with the transfer function
given by the coefficient of the second bracketed term in the above
expression.
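The beam-subtraction interpretation can be checked numerically. The following is a minimal sketch, assuming unit-modulus steering vectors of the form exp(j*omega*delay) and a normalized noise matrix equal to the identity plus one rank-one interference term; the array size, delays, frequency, and interference strength are hypothetical values chosen only for illustration.

```python
import numpy as np

def steering(delays, omega):
    """Unit-modulus steering vector exp(j*omega*delay), one entry per hydrophone."""
    return np.exp(1j * omega * np.asarray(delays, dtype=float))

def spatial_filter(sig_delays, int_delays, omega, k1):
    """Signal beam minus the filtered interference beam (single interference)."""
    s = steering(sig_delays, omega)
    v = steering(int_delays, omega)
    m = len(s)
    g10 = np.vdot(v, s)  # sum over k of exp(j*omega*(tau_k - delta_k))
    return s - (k1 * g10 / (1.0 + m * k1)) * v

# Hypothetical 8-element line array; all numbers below are assumptions.
M = 8
omega = 2 * np.pi * 500.0
tau = 1e-4 * np.arange(M)      # assumed target delays, seconds
delta = -2e-4 * np.arange(M)   # assumed interference delays, seconds
K1 = 10.0                      # assumed interference-to-ambient ratio

h = spatial_filter(tau, delta, omega, K1)

# Cross-check: the same weights follow from inverting the normalized noise matrix.
v = steering(delta, omega)
Q = np.eye(M) + K1 * np.outer(v, v.conj())
h_direct = np.linalg.solve(Q, steering(tau, omega))
```

In this sketch the two-beam form and the direct matrix inverse agree, and the response of the resulting weights toward the interference direction is smaller than that of the plain signal beam.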

For more than one interference the result is basically similar: a
beam is aimed at each interference and the output is subtracted from that
of the main signal beam after passage through a compensating filter. The
complexity of these compensating filters increases with the number of
interferences; in fact, even for a single interference it is such that
automatic design by some sort of adaptive mechanism would almost have to
be used. This point is considered further below.

A major difficulty in the design is that, because of the need to form
several beams simultaneously, beam steering must be done by tapped delay
lines. The number of taps and the tap spacing are largely a function of
array resolution, which in turn can be related to array aperture. For
typical arrays the number of taps tends to be very large; however, this is
true even if conventional or suboptimal instrumentations are used. The
added complexity required in the optimal instrumentation is, from this point
of view, quite modest.

III.  Adaptive  Array  Processing 

The automatic design of complicated filter transfer functions of the
sort mentioned above can be accomplished fairly easily by means of
transversal filters: filters produced by feeding a signal into a tapped delay
line and adding the weighted tap outputs to form the output. For a delay
line having $M$ taps the output $y(t)$ of such a filter has the form

$$
y(t) = \sum_{i=1}^{M} c_i\, x(t - t_i)
$$

where $x(t)$ is the input signal, and $c_i$ is the weight applied to the input
delayed by the time $t_i$. If the tap spacing $t_i - t_{i-1}$ is small for all
$i = 1 \ldots M$, this expression is a discrete approximation of a convolution
integral in which the $c_i$ represent the impulse response of a filter at time
$t_i$. Since each $c_i$ can take on any arbitrary value, extremely complex
filters are easily synthesized in this way. It was shown in progress report
No. 34 that the adjustment of the $c_i$ subject to one of several criteria of
optimality is easily accomplished by means of algorithms based on the
stochastic approximation method of Robbins and Monro. Progress report No. 39
is a continuation and elaboration of the earlier report.
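The tapped-delay-line sum can be sketched in a few lines. This is an illustrative discrete-time version, assuming integer tap delays measured in samples and zero input before the start of the record; the function name and the example weights are hypothetical.

```python
import numpy as np

def transversal_filter(x, weights, tap_delays):
    """y[n] = sum_i weights[i] * x[n - tap_delays[i]], taking x as zero before n = 0."""
    x = np.asarray(x, dtype=float)
    y = np.zeros_like(x)
    for c, d in zip(weights, tap_delays):
        if d == 0:
            y += c * x
        else:
            y[d:] += c * x[:-d]   # input delayed by d samples, weighted by c
    return y

# Hypothetical example: 4 uniformly spaced taps acting as a moving average.
x = np.arange(10.0)
weights = [0.25, 0.25, 0.25, 0.25]
taps = [0, 1, 2, 3]
y = transversal_filter(x, weights, taps)

# With uniformly spaced taps the result matches an ordinary discrete convolution.
y_conv = np.convolve(x, weights)[: len(x)]
```

Because each weight is free, the same routine realizes any finite impulse response on the chosen tap grid, which is the property the adaptive algorithms exploit.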

The  basic  assumptions  used  in  the  analysis  are: 

1) Target, interference, and ambient noise are zero-mean Gaussian processes.

2) The sum of interferences, ambient noise, and local noise is regarded as
the effective noise, which is assumed to be statistically independent
of the target signal.

3) The target signal component $s_i(t)$ observed at the output of the $i$-th
hydrophone is a linear time-invariant transformation of $d(t)$, the
target signal that would be observed at the output of an ideal isotropic
hydrophone located at the origin of coordinates. The autocorrelation
function of $d(t)$ is assumed to be known.

4) The statistics of the noise field are unknown. It is not known whether
interferences are present, or where they are located.

5) The wave fronts of target and interference are assumed to be plane over
the dimensions of the receiving array.

It is assumed that the adaptive mechanism is to produce a filter
optimized in a given direction and designed to suppress interference signals
from other directions. By varying the azimuth for which the filter is
optimized the system produces a bearing response pattern which can be
examined by an operator to determine whether a target is present.

The space-time filter takes the form of a set of $K$ hydrophones, each
connected to the input of the delay line of a transversal filter having
$M$ taps. (Note that this notation differs from that used in most of the
other reports in this series.) The outputs of all the transversal filters
are summed to form the signal $z(t)$, which, after possible further filtering,
is squared and smoothed to yield the observed output. The adjusting
algorithm for the $K(M-1)$ weights in all of the transversal filters then
takes the simple form:


$$
W_{j+1} = W_j + 2\gamma_j\,(R_d - z_j\,\eta_j)
$$

where $W_j$ is the vector of all of the tap weights, suitably indexed,
$\gamma_j$ is a weighting parameter, $R_d$ is the input space-time
autocorrelation function, $z_j$ is the output $z(t)$, and $\eta_j$ is a vector
of delayed versions of the received signal, all at the $j$-th step in the
iteration. The process converges for $\gamma_j$ of the form
$\gamma_j = \gamma / j^{\alpha}$ with $\tfrac{1}{2} < \alpha \le 1$. $R_d$ can
be computed if the target signal direction and autocorrelation function are
known; thus it contains the information about desired target direction that
is needed for the filter to adjust itself.
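A stochastic-approximation weight update of this kind can be illustrated on synthetic data. The sketch below assumes a known correlation vector (called `r_d` here), white received data with identity covariance so that the optimum weight vector equals `r_d`, and a decaying gain gamma/(j + j0)^alpha; the offset `j0` and all numerical values are assumptions introduced only to keep the early iterations well behaved.

```python
import numpy as np

def adapt_weights(samples, r_d, gamma=0.5, alpha=1.0, j0=10):
    """Iterate W <- W + 2*gamma_j*(r_d - z*eta) with gamma_j = gamma/(j + j0)**alpha."""
    w = np.zeros_like(r_d)
    for j, eta in enumerate(samples, start=1):
        z = w @ eta                          # filter output at step j
        gamma_j = gamma / (j + j0) ** alpha  # decaying weighting parameter
        w = w + 2.0 * gamma_j * (r_d - z * eta)
    return w

rng = np.random.default_rng(0)
n_weights, n_steps = 4, 50_000
# Synthetic received data with identity covariance, so the optimum is w_opt = r_d.
data = rng.standard_normal((n_steps, n_weights))
r_d = np.array([0.5, -0.5, 0.5, 0.5])       # assumed known correlation vector
w_final = adapt_weights(data, r_d)
err = np.linalg.norm(w_final - r_d)
```

Only the correlation vector carries information about the desired signal; the data themselves enter through the product of the filter output and the delayed inputs, so no separate reference signal is needed in this sketch.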

General expressions for the convergence of the filter have been
obtained and are given by Eqs. 3.5-32 and 3.5-33 of progress report 39.
These expressions are too complicated to yield much insight. They can,
however, be simplified by choosing specific expressions for the weighting
parameter $\gamma_j$. A particularly simple expression results from the choice

$$
\gamma_j^{(m)} = \frac{1}{2(j+1)\lambda_m}
$$

where $\lambda_m$ is the $m$-th eigenvalue of the covariance matrix of the
received signal, and where the superscript $(m)$ on $\gamma_j$ implies that
different weights are used in different filters. In this case it is found
that the mean-square error at the $(j+1)$-th step satisfies

$$
\epsilon_{j+1}^2 - \epsilon_{\min}^2 =
\frac{j}{(j+1)^2}\,K(M-1)\,\epsilon_{\min}^2
+ \frac{1}{(j+1)^2}\,(W_1 - W_{op})^T R_y\,(W_1 - W_{op})
$$

where $R_y$ is the covariance matrix of the received signal, and where
$\epsilon_{\min}$ is the irreducible error resulting from the fact that a
continuous filter is approximated by a discrete structure. If the second term
is initially larger than the first, then this expression indicates an initial
m.s. error reduction at a rate $j^{-2}$; eventually, however, the first term
will always dominate, with the result that convergence eventually takes place
at a rate $j^{-1}$.
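The crossover between the two convergence regimes can be made concrete with a small calculation. The sketch below evaluates a term proportional to j/(j+1)^2 and a term proportional to 1/(j+1)^2; the number of weights, the minimum error, and the initial misfit are hypothetical values, not figures from the report.

```python
def excess_mse_terms(j, n_weights, e_min_sq, initial_misfit):
    """Return the j/(j+1)^2 term and the 1/(j+1)^2 term of the excess error."""
    noise_term = j / (j + 1) ** 2 * n_weights * e_min_sq
    transient_term = initial_misfit / (j + 1) ** 2
    return noise_term, transient_term

# Hypothetical numbers: 40 adjustable weights, e_min^2 = 0.01, initial misfit 100.
for j in (1, 10, 100, 1000, 10000):
    noise, transient = excess_mse_terms(j, 40, 0.01, 100.0)
    print(j, noise, transient)
```

With these assumed values the transient term dominates for small j and the noise term dominates once j exceeds roughly the ratio of the initial misfit to the product of the weight count and the minimum error.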


As long as the noise environment is stationary the filter converges
to the optimum form discussed in previous progress reports (e.g. No. 38), in
which the interference noises are strongly suppressed. This is shown not
only analytically, but also by means of a computer simulation using real
data. If the noise environment is nonstationary, partial results have
been obtained under the following conditions:

1. If the nonstationarity can be characterised by changing parameters,
with the values of the parameters governed by a known dynamic relation,
then the method of stochastic approximation can be modified by inclusion
of this dynamic relation. In fact the recursive Kalman filter method can
be applied to this case, with results that converge to those obtained by
the method of stochastic approximation in the stationary case. In the
nonstationary case the weighting parameter of the stochastic approximation
algorithm is modified and takes the form $\gamma_j' = \gamma_j + \beta$,
where $\beta$ is a constant. For the case where the optimum gain parameter
$\theta_j$ is given by the relation

$$
\theta_{j+1} = a\,\theta_j + w_j, \qquad 0 < a \le 1
$$

and where the desired filter output is given by

$$
s_j = \theta_j + v_j
$$

and where $w_j$ and $v_j$ are stationary, independent, zero-mean, scalar, white
noise processes with variances $q$ and $\phi$ respectively, then

$$
\beta = q/\phi
$$

For the stationary case $q = 0$ and $a = 1$, so that $\beta = 0$, but in general
the presence of a nonzero $\beta$ prevents the gradual disappearance of the
weighting parameter $\gamma_j'$, which would make tracking of a changing
environment impossible. On the other hand, the fact that $\gamma_j'$ does not
go to zero as $j \to \infty$ has the effect that the filter does not converge
in mean square, which means that a small jitter (proportional to $\beta$)
continues to exist in the output.
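The trade-off between tracking and residual jitter can be illustrated with a scalar simulation. This is a minimal sketch, assuming a random-walk parameter, additive white observation noise, and a gain of the form 1/(j+1) plus a constant floor; the variances and record length are hypothetical.

```python
import numpy as np

def track(observations, beta=0.0):
    """Scalar recursion est <- est + g_j*(s_j - est) with g_j = 1/(j+1) + beta."""
    est, history = 0.0, []
    for j, s in enumerate(observations, start=1):
        gain = min(1.0, 1.0 / (j + 1) + beta)
        est += gain * (s - est)
        history.append(est)
    return np.array(history)

rng = np.random.default_rng(1)
n, q, phi = 20_000, 0.01, 1.0                            # assumed values
theta = np.cumsum(np.sqrt(q) * rng.standard_normal(n))   # drifting parameter
obs = theta + np.sqrt(phi) * rng.standard_normal(n)      # noisy observations

half = n // 2
mse_decaying = np.mean((track(obs)[half:] - theta[half:]) ** 2)
mse_floor = np.mean((track(obs, beta=q / phi)[half:] - theta[half:]) ** 2)
```

In this sketch the purely decaying gain averages over the whole record and loses the drifting parameter, whereas the gain with a constant floor keeps following it at the cost of a nonzero steady-state fluctuation.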


2. If the nonstationarity is such that the optimum gain parameter $\theta_j$
satisfies a relation of the form


-3+1  ■  +  0  <7> 


i.e. the nonstationarity is in a sense "temporary" and disappears as
$j \to \infty$, then the standard method of stochastic approximation converges
as long as


the weighting factor $\gamma_j$ has the form


where 


$\tfrac{1}{2} < \alpha < 1$


and 


$s < \alpha$


Other methods for dealing with nonstationary environments can be
envisioned, but have not yet been evaluated.


IV. Optimum Passive Bearing Estimation in a Spatially Coherent
Noise Environment

Report No. 40 is a continuation of report No. 37, which was included
in vol. V. The earlier report dealt with the Cramer-Rao lower bound for
determining the rms bearing error attainable in an isotropic noise field.
The present report extends this to the case where interference is present.
As in the earlier report, the analysis initially considers an arbitrary
number of hydrophones arbitrarily spaced on a linear array, and arbitrary
signal, ambient noise, and interference spectra. However, in order to get
results that are simple enough to yield some insight into important
parameters, some of this generality is sacrificed; in particular it is
assumed that the ambient noise power is much greater than the signal power;
signal, interference, and ambient noise spectra are taken to be identical in
form, and the hydrophone spacing is uniform. Additional important assumptions
are that the interference bearing is known and that the ambient noise is
independent from hydrophone to hydrophone. Also, as in the earlier report,
the performance of the split-beam tracker is computed to provide a comparison
between the possibly unrealizable bound and a practical instrumentation.

Approximate expressions for the lower bound take on simple forms if
the target and interference separation is either very large or very small.
In each case limiting expressions have been obtained for the ambient-noise
dominated case ($MI \ll N$) and for the interference dominated case
($MI \gg N$). The parameter determining target and interference separation is
$\gamma = (d/c)\,\omega_{\max}(\theta - \phi)$, where $d$ is the hydrophone
spacing, $c$ is the sound velocity, $\omega_{\max}$ is the maximum frequency,
$\theta$ is the target bearing and $\phi$ is the interference bearing. If the
bias terms are neglected then for $\gamma \gg 1$ the respective lower bounds
are approximately


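$$
\overline{(\hat{\theta} - \theta)^2} \ge
\frac{36\pi c^2 (N^2 + MNS)/S^2}
     {T\,\omega_{\max}^3\, d^2 \cos^2\theta\,(M^4 - M^2)\,[1 + (M-2)I/N]}
\qquad (MI \ll N)
$$

$$
\overline{(\hat{\theta} - \theta)^2} \ge
\frac{36\pi c^2 [N^2 + (M-1)NS]/S^2}
     {T\,\omega_{\max}^3\, d^2 \cos^2\theta\,[M^4 - M^2 - \tfrac{8}{5}M^3 + 2M]}
\qquad (MI \gg N)
$$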


The lower bound for $I = 0$ (i.e. no interference) is the same as that
found in report No. 37. By comparing the denominators in the two
expressions above one can conclude that the effect of a remote interference
is equivalent to the loss of 2/5 of a hydrophone.
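The "2/5 of a hydrophone" reading follows from expanding the fourth power of (M - 2/5), whose leading correction is -(8/5) M^3. A short numerical check, assuming only the two hydrophone-count factors M^4 - M^2 - (8/5)M^3 + 2M and M^4 - M^2 that appear in the denominators (the function names are hypothetical):

```python
def remote_interference_factor(m):
    """Hydrophone-count factor of the interference-dominated bound."""
    return m**4 - m**2 - 8.0 / 5.0 * m**3 + 2.0 * m

def no_interference_factor(m):
    """Hydrophone-count factor M^4 - M^2 of the interference-free bound."""
    return m**4 - m**2

for m in (10, 20, 50):
    exact = remote_interference_factor(m)
    shifted = no_interference_factor(m - 0.4)   # array shortened by 2/5 hydrophone
    print(m, exact / shifted)
```

The ratio approaches unity as M grows, so for arrays of practical size the remote interference costs about as much as removing 0.4 of a hydrophone from an interference-free array.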

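For near interference, such that $\gamma < 1/M$, the corresponding results
are

$$
\overline{(\hat{\theta} - \theta)^2} \ge
\frac{36\pi c^2 (N^2 + MSN)/S^2}
     {T\,\omega_{\max}^3\, d^2 \cos^2\theta\,(M^4 - M^2)\,[1 + (M^4 - M^2) I^2 \gamma^2 / (10 N^2)]}
\qquad (MI \ll N)
$$

$$
\overline{(\hat{\theta} - \theta)^2} \ge
\frac{36\pi c^2\, MI\,[N + (M-1)S]/S^2}
     {T\,\omega_{\max}^3\, d^2 \cos^2\theta\,(M^4 - M^2)}
\qquad (MI \gg N)
$$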


If the difference in denominators (which amounts to the previously mentioned
2/5 M) is discounted, the lower bound in the interference dominated case is
seen to be $MI/N$ times as large as for large $\gamma$.

A  modified  form  of  split-beam  tracker  employing  simple  interference 
nulling  ahead  of  the  split-beam  section  was  considered  in  progress  report 
No.  29.  For  this  tracker  the  following  results  were  obtained: 


$$
\overline{(\hat{\theta} - \theta)^2} =
\frac{96\pi c^2 N^2/S^2}
     {T\,\omega_{\max}^3\, d^2 \cos^2\theta\, M^2 (M-2)^2}
$$
(r, \2  2  2  /e2 

_ (64)  tic  N  / S _ 

3  2  2  2  2 

T  J  d  cos^  8  M  <M-2; 

,  wax 


The second of these expressions is invalid for $\gamma$ very near zero because
some of the approximations made to obtain it break down; however, it is an
indication that the modified split-beam tracker cannot estimate bearings
extremely close to the interference bearing (since as a result of the
nulling there is no signal in this direction). In this respect the
split-beam tracker performance appears to fall considerably short of the
Cramer-Rao bound, which is finite for $\gamma = 0$, albeit considerably larger
than for large separation. The comparison between the split-beam tracker and
the Cramer-Rao bound is facilitated by computing the ratio of the two error
variances:


$$
\frac{\sigma^2}{\sigma_{cr}^2} =
\begin{cases}
2.67\,(1 + 2.4/M), & \gamma \gg 1,\ M \gg 1 \\
(1 + 8/M), & 0 < \gamma \ll 1,\ M \gg 1
\end{cases}
$$


The second of these expressions is invalid for $\gamma \approx 0$ as indicated
above.

Both expressions indicate that for sufficiently large $M$ the split-beam
tracker performance is fairly close to the lower bound; but at the same
time they also suggest that some improvement might be achieved, particularly
for small separations between target and interference, by going to a
different implementation. Such implementations are currently being studied.

By plotting curves for the exact expressions relating
$\overline{(\hat{\theta} - \theta)^2}$ to $\gamma$ it is found that the large
$\gamma$ approximation is good for separations between target and interference
bearing greater than the beam width of the array, defined as the angle for
which the signal output falls to one half its maximum value. This is roughly
true both for the C-R bound and the split-beam tracker. Also, in both cases,
for separation smaller than the beamwidth the performance deteriorates
rapidly; however, the deterioration is considerably more rapid in the case of
the split-beam tracker.

The error variance decreases with $M$ for large separations in both
cases, and the C-R bound decreases with $M$ for zero separation between
target and interference bearing. Thus theoretically the error can be made
arbitrarily small for both large and small separations by letting $M$ become
sufficiently large. Here it must be noted, however, that for a fixed size
array the assumption of zero ambient noise correlation between adjacent
hydrophones will become invalid for very large $M$.


V.  Space  Time  Properties  of  Sonar  Detection  Models 

In all previous work, and in most analyses of sonar in the literature,
the array configurations are taken as given. In a good many of the analyses
reported in the previous volumes, in fact, the arrays have been assumed to
be linear and with equally spaced hydrophones. The question naturally
arises as to whether the performance of an array with a given number of
hydrophones might not be improved substantially by seeking an optimum
configuration.

It turns out that the attempt to find algorithms for determining the
optimum placement of hydrophones involves searches through a $3K$-dimensional
continuum, where $K$ is the number of hydrophones. For the large values of $K$
that are of practical interest such a search is an extremely formidable
undertaking for which there is no guarantee of success. Hence it becomes
very desirable to obtain first some estimate for the ultimate performance
of which an array with a large number of arbitrarily spaced hydrophones is
capable. Such an estimate is, however, even conceptually possible only if
in the limit of continuous observation (i.e. as $K \to \infty$) the signal
model remains nonsingular. Many commonly used models turn out, in fact, to be
singular; i.e. as $K \to \infty$ it becomes possible to determine the presence
or absence of the signal with zero error even though both the array size and
observation times are finite. For this reason such models are physically
not completely realistic (which is not to say that they are not useful), and
it is desirable to obtain general conditions guaranteeing that a given model
be nonsingular. This is in essence what is done in report No. 41.

The approach taken is based on the realization that any communication/
detection (C/D) system can be represented as a series of mapping operations;
i.e. an encode operator $e$ maps source characters from the space $A$ of source
characters into the space $W$ of channel signals, which is in turn mapped by a
transmit operator $t$ into a space $V$ of receivable signals, etc., until the
final mapping produces an estimate $\hat{a}$ of the source character, which is
an element of the space $A$. The operators are stochastically determined, hence
can be considered as being themselves elements of probability spaces $E$, $T$,
etc. The notation is generalised by denoting the space of source characters
by $S_1$ and the space of mappings of $S_{2K-1}$ into $S_{2K+1}$ by $S_{2K}$,
where $K = 1, 2, 3, \ldots, L$.

A marginal probability measure $\mu_i$ may then be defined on each of the
spaces $S_i$, where $i = 1, 2, 4, 6, \ldots, 2L$. These measures will induce
probability measures in the remaining spaces $S_{2K+1}$,
$K = 1, 2, 3, \ldots, L$; furthermore they induce conditional measures of the
form $\mu_{2K+1}^{a}$, the measure induced in $S_{2K+1}$ conditioned on the
transmission of a signal $a$.

A class of models having particularly simple properties are the
factorable models. In this class the probability measure $\mu$ defined on the
product space $S = S_1 \times S_2 \times S_4 \times \cdots \times S_{2L}$ is
given by

$$
\mu = \mu_1\,\mu_2\,\mu_4 \cdots \mu_{2L}
$$

This form of the measure implies that the stochastic operations of the model
are independent. Most of the models used in the usual communication and
detection studies are factorable in this sense.

The central theorems concerning the singularity of models are then
given by Corollary 1 of Theorem 9, and Theorem 10 of chapter 2:

If $S_1, \ldots, S_{2L+1}$ are countable with discrete metric, then the model
$M$ is singular if and only if the conditional measures $\mu_{2L+1}^{r}$ and
$\mu_{2L+1}^{s}$ are orthogonal for every pair of characters $r$ and $s$ in
$S_1$ for which $v(r) > 0$ and such that $r \ne s$.

If $M$ is a factorable model then it is singular if and only if
$\mu_{2k+1}^{r}$ and $\mu_{2k+1}^{s}$ are orthogonal for all $k \le L$, and
$r \ne s$.

The implication of these theorems is that for a factorable model to
be singular, singularity must be present in the first stage, and it must
be preserved by all subsequent transformations or mappings. While this
might appear to be a rather strong requirement which would have the effect
of making most practical models nonsingular, it turns out that many of the
usual encoding transformations considered in communications processes
preserve singularity, so that singular models are actually more common than
might be supposed. In particular, it is shown in Theorem 1 of chapter 3
that additive stages are usually singularity preserving. On the other hand,
it is shown in Theorem 2 of this chapter that if stage $k$ is such that the
support space of its measure is a subspace of the previous space, and if that
measure is independent of $\mu_i$ for all $i < 2k$, then the model $M$ is
nonsingular.

Particularly simple statements can be made if the conditional measures
are Gaussian. In this case one can use the fact that two Gaussian
measures are either orthogonal, or they are equivalent. Furthermore,
according to Theorem 4 of chapter 3 two Gaussian distributions $P$ and $Q$ are
equivalent if and only if

1. $m(\cdot) \in H(\Gamma_Q)$

2. $\Gamma_P$ has a representation
$\Gamma_P(s,t) = \sum_k \lambda_k\, e_k(s)\, e_k(t)$, where the set of
functions $e_k(t)$ is a complete orthonormal set in the reproducing
kernel Hilbert space $H(\Gamma_Q)$, $\sum_k (1 - \lambda_k)^2 < \infty$, and
$\lambda_k > c > 0$ for all $k$.

In this theorem $\Gamma_P$ and $\Gamma_Q$ are the covariances of the
distributions $P$ and $Q$, with mean functions $m(\cdot)$ and 0 respectively.
As a consequence of this theorem singularity may occur when the mean function
of the signal process lies outside the space $H(\Gamma_Q)$, i.e. if $P$ has a
linear projection outside the support space of $Q$; if this is not the case
singularity may still occur if some noise eigenvalues are zero or if the
signal and noise processes do not put almost the same energy into all but a
finite number of dimensions (or eigenvalues).
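The summability condition on the eigenvalues can be illustrated with two hypothetical spectra. In the sketch below the first spectrum approaches unity fast enough for the sum of (1 - lambda_k)^2 to stay bounded, while the second stays a fixed distance from unity so that the sum grows without limit; both spectra are assumptions chosen for illustration only.

```python
def equivalence_sum(eigenvalues):
    """Partial sum of (1 - lambda_k)^2 over the supplied eigenvalues."""
    return sum((1.0 - lam) ** 2 for lam in eigenvalues)

n_terms = (10, 1_000, 100_000)

# Hypothetical spectrum 1: lambda_k = 1 + 1/k, so the terms are 1/k^2 (summable).
converging = [equivalence_sum(1.0 + 1.0 / k for k in range(1, n + 1)) for n in n_terms]

# Hypothetical spectrum 2: lambda_k = 2 for all k, so every term equals 1 (not summable).
diverging = [equivalence_sum(2.0 for _ in range(n)) for n in n_terms]
```

The first case corresponds to two processes that put almost the same energy into all but finitely many dimensions; the second corresponds to a persistent energy mismatch, which is the situation in which the condition fails.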

Applications of this theory have been made to two simple sonar situations.
The first of these is one dimensional: a source is either to the right or to
the left of the observer, and the observer can determine the direction of wave
propagation. This situation is singular, even if the velocity of propagation
is random, if random noise is added, and if other random effects are present,
as long as the randomness is not sufficient to make a right-going wave look
like a left-going wave.

The more interesting problem of sonar in three dimensions has also
been analyzed, with the result that the usual model in which the signal
wavefront is a deterministic function of the coordinates is also shown to be
singular. This explains the result of Vanderkulk that as the number of
hydrophones goes to infinity, the array gain becomes infinite and detection
becomes perfect. This result is shown to hold even if white noise is added
at each hydrophone; to produce a nonsingular model it is necessary to
introduce some perturbations into the wavefront. The effect of perturbed
wavefronts is currently being analyzed.




THE  OPTIMUM  DETECTOR  FOR  NONISOTROPIC  NOISE 


by 

Franz  B.  Tuteur 


Progress Report No. 38
General Dynamics/Electric Boat Research


September  1968 


DEPARTMENT OF ENGINEERING
AND  APPLIED  SCIENCE 

YALE  UNIVERSITY 


Sunnary 

The  feasibility  of  using  tapped-delay-line  filters  to  synthesize  the 
optimum  processor  is  investigated  in  this  report.  It  is  found  that  the 
most  severe  requirement  that  is  placed  on  the  delay  lines  arises  from  the 
necessity  of  steering  the  array  in  steps  that  are  commensurate  with  the 
resolution  of  which  the  array  is  theoretically  capable.  If  delay  lines  to 
accomplish this can be fabricated, then the additional complexity required
in  the  construction  of  an  optimal  filter  is  relatively  minor;  that  is,  it 
requires  delay  lines  of  no  greater  complexity. 


The Optimum Detector for Nonisotropic Noise


I. Introduction

In  Progress  Report  No.  33  (Ref.l)  the  effect  of  localized  noise  sources 
on  the  performance  of  the  optimum  (likelihood-ratio)  detector  of  directional 
Gaussian  signals  was  investigated.  In  the  present  report  the  structure  of  the 
optimum detector is considered.

The  nomenclature  used  in  this  progress  report  is  exactly  the  same  as  that 
used  in  Ref.  1,  which  is  assumed  to  be  available  to  the  reader. 

II. General Form of the Optimum Detector

The general form of the optimum detector is contained in Eqs.(22) and (23)
of Ref. 1. If the output of the filter is designated by

$$u = \log LR - C \qquad (1)$$

then the optimum detector structure has the form

$$u = \sum_{n=1}^{WT} \left| H^T(n)\,X(n) \right|^2 \qquad (2)$$

where

$$H(n) = \frac{\sqrt{S(n)}\;Q^{-1}(n)\,V^*(n)}{N(n)\sqrt{1 + S(n)G_o(n)/N(n)}} \qquad (3)$$

and where the optimum array gain $G_o(n)$ is defined by

$$G_o(n) = V^T(n)\,Q^{-1}(n)\,V^*(n) \qquad (4)$$

If the time of observation T is large, the summation in (2) can be converted to
an integral in the frequency variable f; hence the detector output u takes the
form

$$u \approx T \int_0^{W} \left| H^T(f)\,X(f) \right|^2 df \qquad (5)$$

where by direct analogy with Eq.(3)

$$H(f) = \frac{\sqrt{S(f)}\;Q^{-1}(f)\,V^*(f)}{N(f)\sqrt{1 + S(f)G_o(f)/N(f)}} \qquad (6)$$

In this expression $S(f)$ and $N(f)$ are, respectively, the signal and noise spectral
densities, and $Q(f)$ is the noise spectral matrix whose elements are the cross
spectral densities of the noise voltages received on the different hydrophones
of the array. $V(f)$ is the steering vector, given by

$$V(f) = \begin{bmatrix} a_1 e^{j2\pi f\tau_1} \\ \vdots \\ a_M e^{j2\pi f\tau_M} \end{bmatrix} \qquad (7)$$

where the $a_i$ are hydrophone gains and the $\tau_i$ the signal delays. Also, the array
gain becomes

$$G_o(f) = V^T(f)\,Q^{-1}(f)\,V^*(f) \qquad (8)$$

If the bandwidth W is very large, Parseval's theorem can be used to convert
Eq.(5) into

$$u \approx W \int_0^{T} |z(t)|^2\,dt \qquad (9)$$

where $z(t)$ is the inverse Fourier transform of $H^T(f)X(f)$. This implies that
u can be obtained from a circuit of the form shown in Fig. 1.
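As a numerical illustration of Eqs.(6) to (8), the following sketch evaluates the array gain and the filter vector at a single frequency. It assumes unit hydrophone gains, uncorrelated noise ($Q = I$), and a hypothetical 8-element line array; the function names and all numerical values are illustrative and do not come from the report.

```python
import numpy as np

def steering_vector(f, delays, gains=None):
    """V(f) of Eq.(7): elements a_k * exp(j*2*pi*f*tau_k)."""
    delays = np.asarray(delays, dtype=float)
    gains = np.ones_like(delays) if gains is None else np.asarray(gains, dtype=float)
    return gains * np.exp(1j * 2 * np.pi * f * delays)

def array_gain(V, Q):
    """G_o(f) = V^T Q^{-1} V* of Eq.(8); real for a Hermitian Q."""
    return np.real(V @ np.linalg.solve(Q, V.conj()))

def optimum_filter(V, Q, S, N):
    """H(f) of Eq.(6) for scalar spectral densities S and N at one frequency."""
    G = array_gain(V, Q)
    return np.sqrt(S) * np.linalg.solve(Q, V.conj()) / (N * np.sqrt(1 + S * G / N))

# Hypothetical example: 8 hydrophones, 2 ft spacing, source at 30 degrees.
M, d, c = 8, 2.0, 5000.0
tau = np.arange(M) * d / c * np.sin(np.radians(30.0))
V = steering_vector(1000.0, tau)
Q = np.eye(M)                      # no interphone noise correlation (assumption)
G_o = array_gain(V, Q)             # equals M when Q = I and all gains are unity
H = optimum_filter(V, Q, S=0.1, N=1.0)
```

With $Q = I$ and unit gains the array gain reduces to the number of hydrophones, which gives a quick sanity check on any implementation of Eq.(8).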




Figure  1.  Likelihood-ratio  Detector 

In this figure $H_c(f)$ is a filter containing the common frequency-sensitive
component in $H(f)$, namely

$$H_c(f) = \frac{\sqrt{S(f)}}{N(f)\sqrt{1 + S(f)G_o(f)/N(f)}} \qquad (10)$$

The individual filters $H_1(f), \ldots, H_M(f)$ are then respectively the first,
second, ..., Mth row of the matrix product $Q^{-1}(f)V^*(f)$.

III. Detailed Structure of the Filter with Directional Interference

As in Section IV of Ref. 1 we assume that the noise component consists of
an isotropic part and a number of point sources. Then the noise spectral
matrix has the form

$$Q(f) = \frac{N_o(f)}{N(f)} \left[ Q_o(f) + \sum_{r=1}^{R} K_r(f)\,V_r^*(f)\,V_r^T(f) \right] \qquad (11)$$

where $K_r(f) = I_r(f)/N_o(f)$ and where $I_r(f)$ is the spectral density of the rth noise
source.

Since  the  frequency  weighting  filter  of  Eq.(10)  is  common  to  all  channels, 
the  essential  operation  performed  by  the  processor  is 

$$H'(f) = Q^{-1}(f)\,V^*(f) \qquad (12)$$

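The inversion of the spectral matrix in (12) can be accomplished by
means of Eq.(35) in Ref. 1. The result is

$$H'(f) = \frac{N(f)}{N_o(f)} \left\{ Q_o^{-1}V_0^* - Q_o^{-1}\left[\sqrt{K_1}\,V_1^* : \sqrt{K_2}\,V_2^* : \cdots : \sqrt{K_R}\,V_R^*\right] [I + G]^{-1} \begin{bmatrix} \sqrt{K_1}\,G_{10} \\ \vdots \\ \sqrt{K_R}\,G_{R0} \end{bmatrix} \right\} \qquad (13)$$

where the dependence of $Q_o$, $V_r$, $K_r$, $G_{r0}$, and $G$ on f has been suppressed for
convenience. Here $G_{rs} = V_r^T Q_o^{-1} V_s^*$, and $G$ is the R x R matrix with elements
$\sqrt{K_r K_s}\,G_{rs}$. The notation of Ref. 1 is used, with n replaced by f in all
cases. Note that the scalar multiplying factor $N(f)/N_o(f)$ can be absorbed
into $H_c(f)$ of Eq.(10) so that the essential operation is that indicated by
the expression in the braces, {...}.

To gain some insight into the implications of this result we consider some
simple examples. In all cases we assume that $Q_o(f) = I$, the unit matrix;
this implies that there is no interphone correlation of the isotropic noise
component.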


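Suppose first that there is only a single interference. Then the expression
inside the braces in Eq.(13) becomes

$$H''(f) = \frac{N_o(f)}{N(f)}\,H'(f) = V_0^* - \frac{K_1 G_{10}}{1 + K_1 M}\,V_1^* = \begin{bmatrix} e^{-j\omega\tau_1} - \dfrac{K_1 G_{10}}{1 + K_1 M}\,e^{-j\omega\tau_1^{(1)}} \\ \vdots \\ e^{-j\omega\tau_M} - \dfrac{K_1 G_{10}}{1 + K_1 M}\,e^{-j\omega\tau_M^{(1)}} \end{bmatrix} \qquad (14)$$

where $\omega = 2\pi f$ and where

$$G_{10}(f) = \sum_{k=1}^{M} e^{j2\pi f(\tau_k^{(1)} - \tau_k)} \qquad (15)$$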


The unsuperscripted $\tau$'s are the signal delays, while the superscripted $\tau$'s
are the interference delays. Thus Eq.(14) indicates that the filter forms
two beams, one steered on the signal and one on the interference; the output
of the interference beam is passed through a filter with transfer function
$K_1G_{10}/(1 + K_1M)$ and the result is subtracted from the signal beam output.

A possible system block diagram is shown in Figure 2. This system is quite
similar to that proposed by V.C. Anderson [4] and reported on by Cox [5].
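The two-beam structure of Eq.(14) can be checked numerically against a direct inversion of the noise matrix. The sketch below assumes $Q_o = I$, unit hydrophone gains, and a hypothetical 10-element line array with made-up bearings and interference strength; none of these values appear in the report.

```python
import numpy as np

def single_interference_filter(V0, V1, K1):
    """Braced expression of Eq.(14) for Q_o = I:
    signal beam weights minus the filtered interference beam weights."""
    M = len(V0)
    G10 = V1 @ V0.conj()          # Eq.(15): sum_k exp(j*2*pi*f*(tau_k^(1) - tau_k))
    return V0.conj() - K1 * G10 / (1 + K1 * M) * V1.conj()

# Hypothetical geometry and levels (illustrative only).
M, d, c, f = 10, 2.0, 5000.0, 800.0
k = np.arange(M)
V0 = np.exp(1j * 2 * np.pi * f * k * d / c * np.sin(np.radians(10.0)))  # signal
V1 = np.exp(1j * 2 * np.pi * f * k * d / c * np.sin(np.radians(50.0)))  # interference
K1 = 20.0                         # interference-to-isotropic-noise ratio

H_closed = single_interference_filter(V0, V1, K1)

# Direct route: invert I + K1 * V1* V1^T and apply it to V0*.
Q = np.eye(M) + K1 * np.outer(V1.conj(), V1)
H_direct = np.linalg.solve(Q, V0.conj())
max_error = np.max(np.abs(H_closed - H_direct))
```

The closed form needs only the two beam outputs and one scalar filter, which is why the block diagram requires no matrix inversion in hardware.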


Figure 2. Optimum Filter for Single Interference.


This filter can be constructed using M tapped delay lines to generate the
delays $\tau_k$ and $\tau_k^{(1)}$ in each hydrophone channel, and two additional delay lines
to generate the filter functions $H_c(f)$ and $K_1(f)G_{10}(f)/[1 + MK_1(f)]$. The use
of tapped delay lines for the construction of variable filters is discussed
in detail in Refs. [2] and [3]; it has the advantage that such filters permit
automatic adjustment by relatively simple adaptive algorithms.

It  is  clear  that  the  delay  lines  used  in  each  of  the  hydrophone  channels 
must  have  a  sufficient  number  of  taps  and  a  sufficiently  small  inter-tap 
spacing  to  permit  steering  to  any  one  of  the  distinct  beams  that  can  be  re¬ 
solved  by  the  array  system.  In  this  connection  it  should  be  noted  that  if 
interference  elimination  were  not  a  factor  mechanical  steering  of  the  array 
could  be  used  to  reduce  the  length  of  these  delay  lines.  However,  since 
interferences may come from any direction, interference elimination requires
that delay lines of the maximum length needed to steer the array through 360°
be used in each channel.

A  discussion  of  delay-line  characteristics  is  given  in  Appendix  B,  and 
it  is  shown  there  that  the  number  of  taps  needed  tends  to  be  very  large. 
Specifically,  for  a  linear  array  with  M  hydrophones  spaced  uniformly  a 
distance  d  apart  the  number  of  taps  is  given  by 

$$K = \frac{B\,d\,M(M-1)}{2\sqrt{6}\,c}$$

where B is the signal bandwidth and c the velocity of sound. Using typical
values of $B = 2\pi \times 5000$ rad/sec, d = 2 ft, c = 5000 ft/sec, and M = 100, this
works out to K ≈ 25400 taps. Also, it is shown in Appendix B that the tap
increment under these conditions should be on the order of 1.55 μsec. The
number of taps needed appears to be well beyond currently available hardware.
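The arithmetic behind the tap count can be reproduced with a few lines. The sketch below evaluates the expression $K = B\,d\,M(M-1)/(2\sqrt{6}\,c)$ for the values quoted in the text; the tap increment is computed here as the end-to-end steering delay $(M-1)d/c$ divided by K, which is an assumption made for illustration since Appendix B is not reproduced in this section.

```python
import math

def linear_array_taps(B, d, c, M):
    """Number of taps K = B*d*M*(M-1) / (2*sqrt(6)*c) for a uniform linear array."""
    return B * d * M * (M - 1) / (2 * math.sqrt(6) * c)

B = 2 * math.pi * 5000.0      # signal bandwidth, rad/sec
d, c, M = 2.0, 5000.0, 100    # spacing (ft), sound speed (ft/sec), hydrophones

K = linear_array_taps(B, d, c, M)
max_delay = (M - 1) * d / c          # end-to-end steering delay, seconds
tap_increment = max_delay / K        # assumed definition; equals 2*sqrt(6)/(B*M)

print(f"K = {K:.0f} taps, tap increment = {tap_increment * 1e6:.2f} microseconds")
```

For these values the expression gives roughly 2.54 x 10^4 taps and a spacing near 1.56 μsec.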




The function $G_{10}(f)$ given in Eq.(15) is most easily constructed from a
delay line having at least M taps giving the delays $\tau_k^{(1)} - \tau_k$; k = 1...M.
Since both target and interference can be located anywhere in azimuth this
line must have the same resolution as that needed in the individual channels.
Furthermore, the maximum delay needed is at least twice that needed in the
channels. This is easily demonstrated by considering a linear array in which

$$\tau_k^{(1)} - \tau_k = k\,\frac{d}{c}\,(\sin\theta_1 - \sin\theta)$$

where $\theta_1$ is the interference direction and $\theta$ the signal direction. The maximum
value of delay, obtained for k = M, $\theta = -\pi/2$ and $\theta_1 = \pi/2$, is 2Md/c while the
minimum value is -2Md/c. A delay line can only produce positive delays; the
effect of negative delays can be obtained by inserting a fixed delay line into
the line from the upper summing junction in Fig. 2, as discussed in Ref. 5.

If this is done the range of delays needed in the tapped delay line under
consideration is 2Md/c compared to a maximum delay of (M-1)d/c needed in the
channels.
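The extreme values of the delay difference can be confirmed by a brute-force scan over hydrophone index and both directions. This is a minimal sketch assuming the linear-array expression above with k running from 1 to M; the grid resolution and the array parameters are arbitrary illustrative choices.

```python
import numpy as np

def delay_difference(k, d, c, theta_1, theta):
    """tau_k^(1) - tau_k = k*(d/c)*(sin(theta_1) - sin(theta)) for a linear array."""
    return k * d / c * (np.sin(theta_1) - np.sin(theta))

M, d, c = 100, 2.0, 5000.0
angles = np.linspace(-np.pi / 2, np.pi / 2, 181)   # 1-degree grid including end-fire
k = np.arange(1, M + 1)

# Evaluate every (hydrophone, interference direction, signal direction) combination.
diffs = delay_difference(k[:, None, None], d, c,
                         angles[None, :, None], angles[None, None, :])
max_diff, min_diff = diffs.max(), diffs.min()
bound = 2 * M * d / c                               # 2Md/c from the text
```

Both extremes occur at k = M with the signal and the interference at opposite end-fire directions, which is the case singled out in the text.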

The additional frequency weighting $K_1(f)/[1 + MK_1(f)]$ is a relatively
minor modification in the filter characteristic. It can be implemented by
applying different weights to the tap outputs before they are summed, as
explained in Refs. [2] and [3]. Thus the entire filter function
$K_1(f)G_{10}(f)/[1 + MK_1(f)]$ can be constructed from a standard tapped delay line
filter using a delay line with twice as many taps, hence twice as long, as
the delay lines used in the channel. It therefore involves no major additional
design problems.

Actually,  it  is  possible  to  redraw  the  block  diagram  of  Figure  2  in  such 
a way as to cut the length of the delay line needed to generate $G_{10}(f)$ in
half, i.e., to make it no longer than the lines used in the individual channels
to  steer  the  array.  Such  a  block  diagram  is  shown  in  Figure  3. 




Figure  3:  Modified  Block  Diagram  for  Single-Interference  Filter. 

It can be verified that this system produces the same output as Figure 2, but
the delay line at the lower right of the diagram, which is needed to generate
the function $G_{10}(f)$, has taps only at delays $\tau_1^{(1)}$, $\tau_2^{(1)}$, etc., rather than at
the delay differences $\tau_1^{(1)} - \tau_1$, etc. Hence the maximum length
of this line needs to be no greater than the lines used in the individual
channels.




Finally, the function $H_c(f)$ must be considered. Since this is a much
simpler function than $G_{10}(f)$ it can be synthesized by means of a separate
tapped-delay-line filter with only a modest number of taps. Alternatively,
$H_c(f)$ can be generated in each channel by summing some of the tap outputs
of the delay lines used for steering the array.

If the signal and noise spectra are similar in form and if the SNR
$S(f)G_o(f)/N(f)$ is small, $H_c(f)$ can be omitted entirely.

Two  Interferences 

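With two interferences the matrix factors in the braces of Eq.(13) become
explicitly:

$$\begin{bmatrix} 1 + K_1G_{11} & \sqrt{K_1K_2}\,G_{12} \\ \sqrt{K_1K_2}\,G_{21} & 1 + K_2G_{22} \end{bmatrix}^{-1} \begin{bmatrix} \sqrt{K_1}\,G_{10} \\ \sqrt{K_2}\,G_{20} \end{bmatrix}$$

If it is again assumed that $Q_o = I$, then the expression in the braces becomes

$$H'' = V_0^* - A_1(f)V_1^* - A_2(f)V_2^* \qquad (16)$$

where

$$A_1(f) = \frac{K_1(f)[1 + MK_2(f)]\,G_{10}(f) - K_1(f)K_2(f)\,G_{12}(f)G_{20}(f)}{1 + M[K_1(f) + K_2(f)] + [M^2 - |G_{12}(f)|^2]\,K_1(f)K_2(f)} \qquad (17)$$

and

$$A_2(f) = \frac{K_2(f)[1 + MK_1(f)]\,G_{20}(f) - K_1(f)K_2(f)\,G_{21}(f)G_{10}(f)}{1 + M[K_1(f) + K_2(f)] + [M^2 - |G_{12}(f)|^2]\,K_1(f)K_2(f)} \qquad (18)$$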

Thus the system takes the form shown in Figure 4. If M is large, and if the
interference sources are reasonably well separated, it is shown in Ref. 1 that
$|G_{12}(f)|^2 \approx M$ and can therefore be neglected relative to $M^2$. Under these
conditions the denominator factors, giving

$$A_1(f) \approx \frac{K_1(f)}{1 + MK_1(f)}\,G_{10}(f) - \frac{K_1(f)K_2(f)\,G_{12}(f)G_{20}(f)}{[1 + MK_1(f)][1 + MK_2(f)]} \qquad (19)$$

and

$$A_2(f) \approx \frac{K_2(f)}{1 + MK_2(f)}\,G_{20}(f) - \frac{K_1(f)K_2(f)\,G_{21}(f)G_{10}(f)}{[1 + MK_1(f)][1 + MK_2(f)]} \qquad (20)$$
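The exact weights of Eqs.(17) and (18) can be checked against a direct inversion of the noise matrix. The sketch below assumes $Q_o = I$, unit hydrophone gains, $G_{rs} = V_r^T V_s^*$, and a hypothetical 20-element line array; the bearings and the values of $K_1$ and $K_2$ are made up for illustration.

```python
import numpy as np

def steer(f, delays):
    """Steering vector with unit gains: exp(j*2*pi*f*tau_k)."""
    return np.exp(1j * 2 * np.pi * f * np.asarray(delays))

def two_interference_weights(V0, V1, V2, K1, K2):
    """A_1(f) and A_2(f) of Eqs.(17) and (18) for Q_o = I."""
    M = len(V0)
    G10, G20 = V1 @ V0.conj(), V2 @ V0.conj()
    G12, G21 = V1 @ V2.conj(), V2 @ V1.conj()
    den = 1 + M * (K1 + K2) + (M**2 - abs(G12)**2) * K1 * K2
    A1 = (K1 * (1 + M * K2) * G10 - K1 * K2 * G12 * G20) / den
    A2 = (K2 * (1 + M * K1) * G20 - K1 * K2 * G21 * G10) / den
    return A1, A2

# Hypothetical scenario (illustrative values only).
M, d, c, f = 20, 2.0, 5000.0, 900.0
k = np.arange(M)
V0 = steer(f, k * d / c * np.sin(np.radians(5.0)))     # signal
V1 = steer(f, k * d / c * np.sin(np.radians(40.0)))    # first interference
V2 = steer(f, k * d / c * np.sin(np.radians(-60.0)))   # second interference
K1, K2 = 30.0, 10.0

A1, A2 = two_interference_weights(V0, V1, V2, K1, K2)
H_closed = V0.conj() - A1 * V1.conj() - A2 * V2.conj()  # Eq.(16)

# Direct route: invert I + K1 V1* V1^T + K2 V2* V2^T and apply it to V0*.
Q = np.eye(M) + K1 * np.outer(V1.conj(), V1) + K2 * np.outer(V2.conj(), V2)
H_direct = np.linalg.solve(Q, V0.conj())
max_error = np.max(np.abs(H_closed - H_direct))
```

Agreement between the two routes confirms that the filter needs only three beams and two scalar filters, whatever the number of hydrophones.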




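A further simplification can be made if one assumes that $MK_1(f)$ and
$MK_2(f)$ are large. If this is not true then the interference-to-ambient noise
ratio is not really large enough to make interference elimination necessary
or worthwhile. With this additional assumption

$$A_1(f) \approx \frac{G_{10}(f)}{M} - \frac{G_{12}(f)G_{20}(f)}{M^2} = \frac{1}{M}\sum_{k=1}^{M} e^{j2\pi f(\tau_k^{(1)} - \tau_k)} - \frac{1}{M^2}\sum_{k=1}^{M}\sum_{j=1}^{M} e^{j2\pi f(\tau_k^{(1)} - \tau_k^{(2)} + \tau_j^{(2)} - \tau_j)} \qquad (21)$$

and

$$A_2(f) \approx \frac{G_{20}(f)}{M} - \frac{G_{21}(f)G_{10}(f)}{M^2} = \frac{1}{M}\sum_{k=1}^{M} e^{j2\pi f(\tau_k^{(2)} - \tau_k)} - \frac{1}{M^2}\sum_{k=1}^{M}\sum_{j=1}^{M} e^{j2\pi f(\tau_k^{(2)} - \tau_k^{(1)} + \tau_j^{(1)} - \tau_j)} \qquad (22)$$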


It is easily seen that the filter functions $A_1(f)$ and $A_2(f)$ can be
constructed from tapped delay lines with weights applied to the tap outputs.
Since the single summation term has already been discussed, we consider only
the double summation. Again, it is clear that the resolution needed is the
same as in the delay lines used in the channels. Also, if one examines all
extreme values of $\tau_k^{(1)}$, $\tau_k^{(2)}$, $\tau_j^{(2)}$, and $\tau_j$ one can show that for a linear array
the total delay can range from -2Md/c to 2Md/c. Since the delays required
by $A_1(f)$ may be the negative of those required by $A_2(f)$ it is necessary to
use a fixed delay of 2Md/c in the signal beam channel and tapped delay lines
of length 4Md/c in each of the interference channels. Thus the tapped delay
lines must be four times as long as those used in the hydrophone channels.

Since there are $M^2$ terms to be summed it might appear that at least
$M^2$ taps would be needed on these delay lines. Although it is shown in
Appendix B that tapped delay lines used for linear or circular arrays should
have considerably more than $M^2$ taps, this is not necessarily true in other
array geometries. However, if a line with sufficient resolution to resolve
distinct beams of the array system does not have enough taps, it simply
means that some of the terms in the double summation are identical, at least
to the accuracy of the delay increments. Hence these terms will be more
heavily weighted in the sum. Thus it appears that the delay line required
for the double summation needs to be no more complicated than that used for
the single summation. Note that it is just as easy to implement the exact forms
of $A_1(f)$ and $A_2(f)$ as the approximate ones given in Eqs.(21) and (22). If
the additional frequency weighting required by the exact function is reasonably
smooth it will call only for small changes in the weights applied to the tap
outputs. This is true even if the effect of $|G_{12}(f)|^2$ in the denominator
of Eqs.(17) and (18) is taken into account, because the changes introduced
by this term are no more rapid than those produced by the numerator terms.
Also the function $H_c(f)$ may as well be combined with $A_1(f)$ and $A_2(f)$. Thus
the optimum filter capable of handling two interference signals would consist
of M tapped delay lines of unit length and two delay lines of four times
this length. In addition a fixed-length delay line would be needed in the
signal-beam line as discussed in the previous example. The number of taps in
a unit-length tapped delay line is that given in Eq. (A-35) of Appendix B.

It  is  undoubtedly  possible  to  rearrange  the  block  diagram  to  make  more 
efficient  use  of  the  delay  lines  as  was  done  in  the  previous  example. 

However,  since  such  a  procedure  would  not  reduce  the  complexity  of  the  delay 
lines  by  any  order  of  magnitude,  this  matter  is  not  pursued  here. 

More than Two Interferences

If the assumption $Q_o(f) = I$ is used in Eq.(13) the expression in braces
becomes:

$$H''(f) = V_0^* - \left[\sqrt{K_1}\,V_1^* : \sqrt{K_2}\,V_2^* : \cdots : \sqrt{K_R}\,V_R^*\right][I + G]^{-1}\begin{bmatrix} \sqrt{K_1}\,G_{10} \\ \vdots \\ \sqrt{K_R}\,G_{R0} \end{bmatrix} = V_0^* - A_1(f)V_1^* - A_2(f)V_2^* - \cdots - A_R(f)V_R^* \qquad (23)$$

where $A_r(f) = \sqrt{K_r(f)}$ times the rth element of $[I + G]^{-1}[\sqrt{K_1}\,G_{10}, \ldots, \sqrt{K_R}\,G_{R0}]^T$.

Since $[I + G]$ is now an R-dimensional matrix it is clear that instead of
double summations of the sort appearing in Eqs.(21) and (22) $A_r(f)$ now
involves R-fold summations. A typical form of such a summation is the
three-fold summation

$$\sum_{k=1}^{M}\sum_{j=1}^{M}\sum_{l=1}^{M} e^{j2\pi f(\tau_k^{(1)} - \tau_k^{(2)} + \tau_j^{(2)} - \tau_j^{(3)} + \tau_l^{(3)} - \tau_l)}.$$

Although it is somewhat difficult to examine all terms of this sort, it is
fairly clear that the maximum delay that can occur is twice the value
required to steer the array through 360°. Also, it is necessary to account
for the fact that these delays can be positive or negative. Thus each one
of the $A_r(f)$ [or $H_c(f)A_r(f)$] can be generated by a delay line of four unit
lengths. The R-fold summation will, of course, require the addition of $M^R$
terms; however, as was noted in connection with the double summation, many of
these terms are identical, and therefore the variable weights applied to each
tap output should permit the filter functions to be generated without the
need for a larger than normal number of taps.

Assuming that a quadruple-length delay line can be constructed by
connecting four single-length lines in series we see that the total number
of tapped delay lines is 4R+M. In addition at least one fixed delay line
is needed to permit the generation of negative delays.

It will be noted that none of the block diagrams presented so far are in
the form of Figure 1. Since the block diagrams suggested in Ref. 3 are of
this form, it is of some interest to consider the arrangement shown in
Figure 5, which is essentially in the form of Figure 1.

By inspection of this figure the transfer function $H_k(f)$ for k = 1...M
is given by

$$H_k(f) = e^{-j2\pi f\tau_k} - \sum_{r=1}^{R} e^{-j2\pi f\tau_k^{(r)}} A_r(f) \qquad (24)$$

It is clear that a tapped-delay-line filter that can implement $A_r(f)$ for
r = 1...R will have sufficient flexibility to implement $H_k(f)$. Furthermore,
the post-summation filter $H_c(f)$ shown in Figs. 1 and 5 can be moved into
each of the hydrophone channels, and the delay line filter that can implement
$H_k(f)$ can also implement $H_k(f)H_c(f)$. Thus it appears that a delay line of
four times unit length in each hydrophone channel, with adjustable weights
on each tap, should suffice to generate the optimum filter function. This
arrangement would therefore call for 4M unit-length lines, where unit length
refers to a line capable of providing all the delays needed to steer the
array through 360°. In Ref. [1] it was suggested that the number of single
interferences that can be eliminated by the kind of system discussed here
is on the order of $\sqrt{M}$. Therefore, since for all M ≥ 2, $4M > 4\sqrt{M} + M$, the
block diagram of Fig. 5 is less efficient in the use of delay lines than that
of Fig. 4. However, it is again true that no order-of-magnitude difference is
involved.
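The comparison between the two arrangements amounts to simple counting. The sketch below tallies unit-length tapped delay lines for both block diagrams, taking the number of interferences as round(sqrt(M)) purely as an illustration of the order of magnitude quoted from Ref. 1; the function and key names are hypothetical.

```python
import math

def delay_line_counts(M, R=None):
    """Unit-length tapped delay lines needed by the two arrangements.

    R defaults to round(sqrt(M)), an illustrative stand-in for the
    'order of sqrt(M)' number of interferences mentioned in the text.
    """
    if R is None:
        R = round(math.sqrt(M))
    return {
        "beams_plus_filters": 4 * R + M,   # M steering lines plus R quadruple-length filters
        "per_channel_filters": 4 * M,      # quadruple-length line in every hydrophone channel
    }

counts = delay_line_counts(100)

# The inequality 4M > 4*sqrt(M) + M quoted in the text, checked over a range of M.
inequality_holds = all(4 * M > 4 * math.sqrt(M) + M for M in range(2, 10001))
```

For M = 100 the counts are 140 and 400 unit-length lines, a ratio of less than three, consistent with the remark that no order-of-magnitude difference is involved.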




Conclusion 


If correlation of the isotropic noise components between adjacent hydrophones
is negligible then the optimum filter is shown to consist of an array
system capable of forming a signal beam and R additional beams that are
steered on each one of the interference sources. After passing through rather
complex filters these outputs are subtracted from the signal beam-former output
and the result is then passed through a post-summation filter, squared,
and averaged.

Filters of considerable complexity can be synthesized automatically by
use of tapped delay lines. The tap outputs are individually weighted and
then summed to provide the filter output; the weighting can be accomplished
by a simple computer which implements an adaptive algorithm. Such adaptive
filters have been considered by Lucky [2] and by Chang and Tuteur [3].

If tapped delay lines are used to generate the R + 1 beams that must be
formed by the system it is shown that the number of taps required is
proportional to $BD^2/cd$ where B is the signal bandwidth, D is the array size,
d is the interphone spacing and c is the velocity of sound. Typically, for
a linear array with 100 hydrophones the number of taps required is on the
order of 20000 or more, which appears to be well beyond current technology.
However, since this requirement arises primarily from the need to produce
several beams it is shared by suboptimum processors such as the simple
multiple beam former. In fact, it is also shown that the additional complication
needed to make a conventional system into an optimum one is
relatively minor. For a system having M hydrophones, and capable of eliminating
R interference sources, the optimum system would require 4R+M delay
lines, while the simple beam former would require M delay lines. The
conclusion seems to be therefore that if tapped delay lines can be built
to steer the array satisfactorily then the optimum processor can be built
fairly easily by the use of a few additional delay lines plus some
relatively simple associated circuitry.


List  of  References 


1. F.B. Tuteur, "The Effect of Noise Anisotropy on Detectability in an
Optimum Array Processor", General Dynamics/Electric Boat Research
Progress Report No. 33 (September 1967).

2. R.W. Lucky, "Automatic Equalization for Digital Communication", Bell
System Technical Journal XLIV, No. 4 (April 1965), pp. 547-588.

3. J.H. Chang and F.B. Tuteur, "Methods of Stochastic Approximation
Applied to the Analysis of Adaptive Tapped Delay Line Filters", General
Dynamics/Electric Boat Research Rept. No. 34 (October 1967).

4. V.C. Anderson, "Steerable Null Processing", Proc. 23rd Naval Symposium
on Underwater Acoustics, 1965 (429-433).

5. H. Cox, "Array Processing Against Interference", Naval Ship Systems
Command, Washington D.C., October 1967.

6. P.M. Woodward, "Probability and Information Theory with Applications
to Radar", McGraw-Hill Book Company, Inc., New York, 1953, page 101.




Appendix  A 


The Number of Distinct Beams Produced by an Array

Consider a conventional array having M hydrophones as shown in Figure A1.
For the sake of simplicity it is assumed that the only processing done is to
delay the signal from each hydrophone in order to steer the array; the delayed
signals are then summed, the result is squared, and averaged.

Figure A1. Conventional Array.

The received signal is

$$x(t) = s(t) + n(t) \qquad \text{(A.1)}$$

where

$$x(t) = [x_1(t), \ldots, x_M(t)]^T,$$
$$n(t) = [n_1(t), \ldots, n_M(t)]^T,$$
$$s(t) = [s_1(t), \ldots, s_M(t)]^T.$$

For the purpose of this discussion we consider only $s(t)$, which is
assumed to be expanded in a Fourier series so that

$$s(t) = \sum_{n=-WT}^{WT} s_n\,V(n)\,e^{j\omega_n t} \qquad \text{(A.2)}$$

where

$$V(n) = \begin{bmatrix} e^{j\omega_n\tau_1} \\ \vdots \\ e^{j\omega_n\tau_M} \end{bmatrix}$$

gives the signal direction.


The effect of the delays following the hydrophones is to multiply the
signal vector by a steering matrix with the result that after summation the
signal $s_2(t)$ is given by

$$s_2(t) = \sum_{k=1}^{M}\sum_{n=-WT}^{WT} s_n\,e^{j\omega_n(t + \tau_k - \tau_k')} \qquad \text{(A.3)}$$

where $\tau_k'$ is the delay imparted to the signal at the kth hydrophone. After
squaring

$$s_3(t) = \sum_{k=1}^{M}\sum_{l=1}^{M}\sum_{n=-TW}^{TW}\sum_{m=-TW}^{TW} s_n s_m\,e^{j[\omega_n(t + \tau_k - \tau_k') + \omega_m(t + \tau_l - \tau_l')]} \qquad \text{(A.4)}$$

The effect of the integrator is to pass only the dc component of this signal,
which is obtained by setting m = -n in the last summation (we make use of
the usual convention that $\omega_{-m} = -\omega_m$). Thus

$$s_4 = \sum_{k=1}^{M}\sum_{l=1}^{M}\sum_{n=-TW}^{TW} |s_n|^2\,e^{j\omega_n(\tau_k - \tau_k' - \tau_l + \tau_l')} \qquad \text{(A.5)}$$

$$\phantom{s_4} = 2\sum_{k=1}^{M}\sum_{l=1}^{M}\sum_{n=1}^{TW} |s_n|^2 \cos\omega_n(\tau_k - \tau_k' - \tau_l + \tau_l') \qquad \text{(A.6)}$$

where in the last summand the term corresponding to n = 0 has been omitted.
The summation over n can be approximated by an integral in which $|s_n|^2$
becomes $S(\omega)$, the signal power spectrum; thus

$$s_4 \approx \frac{T}{\pi}\sum_{k=1}^{M}\sum_{l=1}^{M}\int_0^{2\pi W} S(\omega)\cos\omega(\tau_k - \tau_k' - \tau_l + \tau_l')\,d\omega \qquad \text{(A.7)}$$


The argument of the cosine function is a function of the steering angle $\theta'$
which depends on the array geometry. Typically, for a linear array in which
the hydrophone spacing is d

$$\tau_k - \tau_k' - \tau_l + \tau_l' = \frac{d}{c}\,(k - l)(\sin\theta - \sin\theta') \qquad \text{(A.8)}$$

where c is the velocity of sound, $\theta$ is the direction of the signal, and $\theta'$
is the direction in which the array is steered. For other array geometries
one can in general only say that

$$\tau_k - \tau_k' - \tau_l + \tau_l' = \frac{D}{c}\,f(k,l,\theta,\theta') \qquad \text{(A.9)}$$

where D is some distance parameter (such as the diameter in a spherical array)
and where $f(k,l,\theta,\theta')$ is a dimensionless function having the following
properties

$$f(k,l,\theta,\theta) = 0 \text{ for all } k \text{ and } l, \qquad f(k,k,\theta,\theta') = 0 \text{ for all } \theta \text{ and } \theta' \qquad \text{(A.10)}$$

If the array is steered approximately in the signal direction
$\theta' = \theta + \Delta\theta$, where $\Delta\theta$ is small, then

$$f(k,l,\theta,\theta') = \Delta\theta\,f'(k,l,\theta) + \frac{(\Delta\theta)^2}{2}\,f''(k,l,\theta) + \cdots \qquad \text{(A.11)}$$

where

$$f'(k,l,\theta) = \left.\frac{d}{d(\Delta\theta)}\,f(k,l,\theta,\theta + \Delta\theta)\right|_{\Delta\theta = 0}, \text{ etc.}$$

If $\Delta\theta$ is small, and if $f'(k,l,\theta)$ is finite, only the first term of this series
needs to be retained, so that for small $\Delta\theta$ Eq.(A.7) becomes:

$$s_4(\Delta\theta) \approx \frac{T}{\pi}\sum_{k=1}^{M}\sum_{l=1}^{M}\int_0^{2\pi W} S(\omega)\cos\left[\omega\,\frac{D}{c}\,f'(k,l,\theta)\,\Delta\theta\right]d\omega \qquad \text{(A.12)}$$


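It is now also possible to expand $s_4(\Delta\theta)$ in terms of $\Delta\theta$ around $\Delta\theta = 0$. If only
terms up to second order are retained in this expansion, then in view of A.10

$$s_4(\Delta\theta) \approx \frac{T}{\pi}\int_0^{2\pi W} S(\omega)\left\{M^2 - \frac{\omega^2 D^2 (\Delta\theta)^2}{2c^2}\sum_{k=1}^{M}\sum_{l=1}^{M}[f'(k,l,\theta)]^2\right\}d\omega \qquad \text{(A.13)}$$

The ratio of output for $\Delta\theta \neq 0$ to that for $\Delta\theta = 0$ is

$$\frac{s_4(\Delta\theta)}{s_4(0)} = 1 - \frac{B^2 D^2 (\Delta\theta)^2}{2c^2}\left\{\frac{1}{M^2}\sum_{k=1}^{M}\sum_{l=1}^{M}[f'(k,l,\theta)]^2\right\} \qquad \text{(A.14)}$$

where

$$B^2 = \frac{\int_0^{2\pi W}\omega^2 S(\omega)\,d\omega}{\int_0^{2\pi W} S(\omega)\,d\omega} \qquad \text{(A.15)}$$

When the integrals in this expression converge for $W \to \infty$, B is frequently taken
as a definition of the signal bandwidth [6]. It can, of course, be evaluated
if an explicit form for $S(\omega)$ and a value for W are known.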


If the beam is completely off target $f(k,l,\theta,\theta')$ is presumably quite
large and therefore the integrand in A.7 is a rapidly oscillating function
for all $k \neq l$. Significant contributions to $s_4$ are therefore made only by
those terms for which k = l, with the result that for the beam completely
off target $s_4(\Delta\theta)/s_4(0) \approx 1/M$. The half beam width $\Delta\theta$ is taken to be the value for which
$s_4(\Delta\theta)/s_4(0)$ takes on some specified value between its maximum value of
unity and its minimum value of 1/M. We take this value to be 1/2; this is a
satisfactory value for all M > 2. For large M the double summation in A.14
is a large number and therefore the value of $\Delta\theta$ required to produce a value
of $s_4(\Delta\theta)/s_4(0)$ of 1/2 is small. Therefore the higher-order terms that were
omitted in A.11 should, in fact, be negligible for sufficiently large M.
Setting A.14 equal to 1/2 results in

$$(\Delta\theta)^2 = \frac{c^2}{D^2 B^2}\cdot\frac{1}{\left\{\dfrac{1}{M^2}\displaystyle\sum_{k=1}^{M}\sum_{l=1}^{M}[f'(k,l,\theta)]^2\right\}} \qquad \text{(A.18)}$$

The beam width is defined to be equal to $2\Delta\theta$; thus

$$\text{Beam width} = 2\Delta\theta = \frac{2c}{DB}\cdot\frac{1}{\sqrt{\dfrac{1}{M^2}\displaystyle\sum_{k=1}^{M}\sum_{l=1}^{M}[f'(k,l,\theta)]^2}} \qquad \text{(A.19)}$$

For simple array geometries the double summation appearing in this
expression can be evaluated in closed form. Thus consider a linear array
in which the hydrophone spacing is d. Then letting D equal the array length,
we have D = (M - 1)d, and we see from Eqs. A.8 and A.9 that
$f(k,l,\theta,\theta') = \frac{1}{M-1}(k - l)(\sin\theta - \sin\theta')$. Hence


$$f'(k,l,\theta) = -\frac{(k - l)\cos\theta}{M - 1}$$

and

$$\sum_{k=1}^{M}\sum_{l=1}^{M}[f'(k,l,\theta)]^2 = \frac{\cos^2\theta}{(M-1)^2}\sum_{k=1}^{M}\sum_{l=1}^{M}(k - l)^2 = \frac{\cos^2\theta}{(M-1)^2}\cdot\frac{M^4 - M^2}{6}$$

Hence for the linear array

$$2\Delta\theta = \frac{2\sqrt{6}\,c\,(M-1)}{DB\sqrt{M^2 - 1}\,\cos\theta} \approx \frac{2\sqrt{6}\,c}{MBd\cos\theta} \qquad \text{(A.20)}$$

if $M \gg 1$.

The fact that this expression becomes infinite for $\theta = \pm\pi/2$ is a reflection
of the fact that in the end-fire direction the first-order term in A.11
vanishes, so that the quadratic term should be used. This is a peculiarity
of the linear array which does not occur with other arrays.
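The closed-form beam width for the linear array can be compared with a direct evaluation of the double summation in Eq. A.19. The sketch below assumes $f'(k,l,\theta) = -(k-l)\cos\theta/(M-1)$ with D = (M-1)d, and uses the typical parameter values quoted elsewhere in the report; the function names are illustrative.

```python
import numpy as np

def beam_width_general(fprime, D, B, c):
    """Eq. A.19: 2*dtheta = (2c/(D*B)) / sqrt((1/M^2) * sum_k sum_l f'(k,l)^2)."""
    M = fprime.shape[0]
    return 2 * c / (D * B) / np.sqrt(np.sum(fprime**2) / M**2)

def linear_fprime(M, theta):
    """f'(k,l,theta) for a uniform linear array of length D = (M-1)*d."""
    k = np.arange(1, M + 1)
    return -(k[:, None] - k[None, :]) * np.cos(theta) / (M - 1)

M, d, c = 100, 2.0, 5000.0
B = 2 * np.pi * 5000.0
theta = 0.0                                   # broadside
D = (M - 1) * d

k = np.arange(1, M + 1)
pair_sum = int(np.sum((k[:, None] - k[None, :]) ** 2))   # sum_k sum_l (k-l)^2

exact = beam_width_general(linear_fprime(M, theta), D, B, c)
approx = 2 * np.sqrt(6) * c / (M * B * d * np.cos(theta))  # large-M form of Eq. A.20
```

For M = 100 the exact and approximate beam widths differ by about one part in 20000, so the large-M form is adequate for the estimates that follow.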

The number of distinct beams is most reasonably defined as

$$N = \frac{2\pi}{\text{average beam width}}$$

However, in order to avoid the complication introduced by the infinity that
occurs in Eq. A.20 for $\theta = \pm\pi/2$ we obtain an estimate of N for the linear
array by use of

$$N = 2\pi\,(\text{average reciprocal beam width})$$

The average reciprocal beam width is the average of $B(D/c)\cos\theta/(2\sqrt{6})$ over the
interval $-\pi/2 < \theta < \pi/2$ and it is equal to $B(D/c)/(\pi\sqrt{6})$. Thus for a linear array
with hydrophone spacing d, the number of distinct beams is approximately

$$N = \frac{2}{\sqrt{6}}\,\frac{BMd}{c} \qquad \text{(A.21)}$$

Typically, we can take d = 2 ft, c = 5000 ft/sec and $B = 2\pi \times 5000$ rad/sec,
giving

$$N = \frac{8\pi}{\sqrt{6}}\,M \approx 10M \qquad \text{(A.22)}$$
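The averaging step can be reproduced numerically. The sketch below averages the reciprocal of the large-M beam width $2\sqrt{6}\,c/(MBd\cos\theta)$ over a uniform grid of steering angles and compares the result with the closed form $N = (2/\sqrt{6})\,BMd/c$; the grid size is an arbitrary choice and the function name is hypothetical.

```python
import numpy as np

def num_beams_linear(B, M, d, c):
    """Closed form N = (2/sqrt(6)) * B*M*d/c for the linear array."""
    return 2.0 / np.sqrt(6.0) * B * M * d / c

B, M, d, c = 2 * np.pi * 5000.0, 100, 2.0, 5000.0

# Reciprocal beam width M*B*d*cos(theta) / (2*sqrt(6)*c), sampled uniformly
# over -pi/2 <= theta <= pi/2 and averaged.
theta = np.linspace(-np.pi / 2, np.pi / 2, 200001)
reciprocal_width = M * B * d * np.cos(theta) / (2 * np.sqrt(6.0) * c)
N_numeric = 2 * np.pi * reciprocal_width.mean()

N_closed = num_beams_linear(B, M, d, c)
beams_per_hydrophone = N_closed / M
```

With the quoted parameters the ratio N/M comes out close to 10.26, matching the "about 10M" estimate.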

Another simple geometry for which A.19 can be put into a closed form is
the circular array with an even number of equally spaced hydrophones.
Assume that the nominal signal wavefront is perpendicular to one of the major
diameters and consider a small displacement $\Delta\theta$ away from this nominal
direction. Then if D is the major diameter it can be shown that

$$f'(k,l,\theta) = \left|\sin\frac{\pi}{M}(k - l)\right|\cos\frac{\pi}{M}(k + l - 2)$$

For sufficiently large M one can replace the double summation by a double
integration:

$$\sum_{k=1}^{M}\sum_{l=1}^{M}\sin^2\frac{\pi}{M}(k - l)\,\cos^2\frac{\pi}{M}(k + l - 2) = \frac{M^2}{\pi^2}\int_0^{\pi}\!\!\int_0^{\pi}\sin^2(x - y)\cos^2(x + y)\,dx\,dy = \frac{M^2}{4} \qquad \text{(A.22a)}$$

Thus Eq. A.19 becomes

$$2\Delta\theta = \frac{4c}{DB} \qquad \text{(A.23)}$$

This appears to be independent of M; however, for constant interphone distance
D is a function of M; in fact for the circular array, with M large, $D \approx Md/\pi$.
(This follows since for large M, Md is approximately the circumference of
the circle.) Hence Eq. A.23 becomes:

$$2\Delta\theta = \frac{4\pi c}{MBd} \qquad \text{(A.24)}$$

and the number of distinct beams is

$$N = \frac{1}{2}\,BMd/c \qquad \text{(A.25)}$$

These expressions are very similar to the corresponding ones for the linear
array, Eqs. A.20 and A.21.

The  dependence  of  N  on  M  is  seen  to  he  a  direct  consequence  of  the 
fact  that  both  the  linear  and  circular  arrays  are  one-dimensional,  so  that 
for  constant  interphone  distance  D  is  proportional  to  M.  This  dependence  is 
different  for  arrays  in  which  the  hydrophones  are  distributed  over  an  area 
or  a  volume , 

The simplest example of an area distribution is an array in which the
hydrophones are equally distributed over a square. Such a distribution, for
M = 9, is shown in Fig. A.2.

Figure A.2: Square hydrophone array.

Evaluation of $\sum_{k=1}^{M}\sum_{\ell=1}^{M}[f'(k,\ell,\theta)]^2$ is somewhat tedious, but essentially
straightforward. It turns out, rather surprisingly, that the result does
not depend on $\theta$, and is, in fact, exactly equal to $M^2/6$. Also, $D = (\sqrt{M}-1)d$.
Hence, by use of Eq. A.19 the beam width is given by

\[ 2\Delta\theta = \frac{2\sqrt{6}\,c}{DB} = \frac{2\sqrt{6}\,c}{(\sqrt{M}-1)Bd} \tag{A.26} \]

and therefore the number of beams is

\[ N = \frac{\pi DB}{\sqrt{6}\,c} = \frac{\pi(\sqrt{M}-1)Bd}{\sqrt{6}\,c} \tag{A.27} \]

Note that in terms of D, B, and c this result is essentially the same as that
obtained for the linear array (Eq. A.20). Hence in going from a one-dimensional
to a two-dimensional array the major change is in the dependence
of D on M.
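To make the comparison between geometries concrete, the following sketch evaluates the beam-count expressions for the linear array (A.21), the circular array (A.25) and the square array (A.27) for the same number of hydrophones. The formulas are coded as they are read from this appendix; the function names and the choice M = 100 are illustrative assumptions.

```python
import math

def beams_linear(M, B, d, c):
    """Eq. A.21: N = (2/sqrt(6)) * B*M*d/c."""
    return 2.0 / math.sqrt(6) * B * M * d / c

def beams_circular(M, B, d, c):
    """Eq. A.25: N = (1/2) * B*M*d/c."""
    return 0.5 * B * M * d / c

def beams_square(M, B, d, c):
    """Eq. A.27: N = pi*(sqrt(M)-1)*B*d/(sqrt(6)*c); M assumed a perfect square."""
    return math.pi * (math.sqrt(M) - 1) * B * d / (math.sqrt(6) * c)

B, d, c = 2 * math.pi * 5000.0, 2.0, 5000.0
M = 100
for name, fn in (("linear", beams_linear),
                 ("circular", beams_circular),
                 ("square", beams_square)):
    print(name, round(fn(M, B, d, c), 1))
```

The one-dimensional arrays give beam counts growing in proportion to M, while the square array grows only as the square root of M, which is the change in the dependence of D on M noted above.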

From dimensional arguments this conclusion can be extended to other
two-dimensional arrays such as a spherical array with hydrophones only on
the surface. For all such arrays

\[ 2\Delta\theta \propto \frac{c}{\sqrt{M}\,Bd} \tag{A.28} \]

where the dependence on $1/\sqrt{M}$ is approximate and holds for large M. In addition,
for a volume distribution it is expected that

\[ 2\Delta\theta \propto \frac{c}{M^{1/3}Bd} \tag{A.29} \]

The factor of proportionality in all of these cases appears to be on the order
of 10 or less.




Appendix  B 


Characteristics of Delay Lines Required in Array Processors

In this appendix we examine the total time delay, number of taps, and
delay between taps of the tapped-delay lines required to steer the array
over 360° in azimuth. We consider initially the linear array with uniform
spacing between hydrophones for the sake of mathematical simplicity; these
results are then extended to other arrays with suitable modifications.

Consider a simple array processor of the form shown in Figure A.1. We
assume that the signal from each hydrophone is applied to a tapped delay
line and that the delays $-\tau_1', -\tau_2', \ldots, -\tau_M'$ are obtained by taking the
output from the proper tap in each channel. We assume that the taps are
equally spaced along the line, that the time delay between adjacent taps is
$\Delta\tau$ and that the total number of taps is K. Thus the maximum delay that can
be obtained from any line is $K\Delta\tau$. It is assumed that all the M delay lines
are identical.

If a linear array is steered in the broad-side direction all the delays
are equal, and we may as well assume that the delays are zero, i.e. the
outputs of the first tap on each line are connected to the summer. Suppose
now that we wish to steer the array away from the broad-side direction by
the angle $\Delta\theta$. The smallest value of $\Delta\theta$ is obtained by making the delay of
the ith hydrophone equal to $(i-1)\Delta\tau$; i.e. on the first delay line we
connect to the first tap, on the second delay line to the second tap, etc.
For a linear array with uniform hydrophone spacing d the difference in
time delay between the ith and jth hydrophones is given by

\[ \tau_i - \tau_j = (i-j)\,\frac{d}{c}\,\sin\theta \tag{A.30} \]


where c is the velocity of sound and $\theta$ is the angle between the wave front and
the array axis. For small $\theta$ near $\theta = 0$, $\sin\theta \approx \theta = \Delta\theta$. Thus, since for
adjacent hydrophones the minimum value of $\tau_i - \tau_j$ is $\Delta\tau$, we have

\[ \Delta\tau = \frac{d}{c}\,\Delta\theta_{\min} \tag{A.31} \]

The minimum value of $\Delta\theta$ obtainable from a linear array with M hydrophones
is given by Eq. A.20 of Appendix A. It seems reasonable to design the system
in such a way that this minimum is matched to the minimum obtainable due to
the limitations imposed by the finite number of taps available on the tapped
delay lines. Thus we get

\[ \Delta\tau = \frac{2\sqrt{6}}{BM} \tag{A.32} \]

(Note that we are actually equating the $\Delta\theta_{\min}$ of A.31 to $2\Delta\theta$ of Eq. A.20;
however this is consistent with the definition of the number of distinct beams
in Appendix A.)
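The tap-selection rule described above can be sketched as follows. This is an illustration only: the function names are hypothetical, the tap increment is taken as 2√6/(BM), the relative delays follow (i−1)(d/c) sin θ, and only steering angles θ ≥ 0 are handled (negative angles would need the delay ordering reversed).

```python
import math

def tap_increment(B, M):
    """Delay between adjacent taps, taken as 2*sqrt(6)/(B*M) (Eq. A.32)."""
    return 2 * math.sqrt(6) / (B * M)

def tap_indices(M, d, c, theta, dtau):
    """Tap offset (0 = first tap) chosen on each of the M delay lines.

    The delay of hydrophone i relative to hydrophone 1 is (i-1)*(d/c)*sin(theta),
    rounded to the nearest available tap. Assumes theta >= 0.
    """
    taps = []
    for i in range(M):
        delay = i * (d / c) * math.sin(theta)
        taps.append(round(delay / dtau))
    return taps

B, d, c, M = 2 * math.pi * 5000.0, 2.0, 5000.0, 10
dtau = tap_increment(B, M)
theta_min = dtau * c / d          # smallest steering angle, from Eq. A.31
print(tap_indices(M, d, c, 0.0, dtau))        # broadside: every line uses its first tap
print(tap_indices(M, d, c, theta_min, dtau))  # smallest step: successive taps
```

At broadside every channel uses its first tap, and at the smallest steering angle successive channels use successive taps, as described in the text.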

In order to steer the beam into the end-fire direction the delays between
adjacent hydrophones must be made equal to d/c; thus the maximum amount of
delay, required at the last hydrophone, is (M - 1)d/c. Therefore the number
of taps on each delay line must be

\[ K = \frac{(M-1)\,d/c}{\Delta\tau} = \frac{B(d/c)\,M(M-1)}{2\sqrt{6}} \tag{A.33} \]

Using the same typical values as in Appendix A, i.e. B = 2π × 5000, d = 2,
c = 5000, we obtain

\[ \Delta\tau = \frac{1.55\times10^{-4}}{M} \]

and K ≈ 2.67 M(M-1).

If M = 100, Δτ = 1.55 μsec and K ≈ 26400 taps.
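The typical-value arithmetic can be reproduced with a few lines. The sketch below assumes the tap increment 2√6/(BM) and the end-fire delay (M−1)d/c discussed above; the function names are illustrative.

```python
import math

B = 2 * math.pi * 5000.0   # rad/sec
d, c = 2.0, 5000.0         # ft, ft/sec

def tap_increment(M):
    """Delay between taps, 2*sqrt(6)/(B*M)."""
    return 2 * math.sqrt(6) / (B * M)

def taps_required(M):
    """Maximum delay (M-1)*d/c divided by the tap increment."""
    return (M - 1) * (d / c) / tap_increment(M)

M = 100
dtau = tap_increment(M)
K = taps_required(M)
coefficient = K / (M * (M - 1))
print(dtau, K, coefficient)
```

Carried through without intermediate rounding, the coefficient of M(M−1) comes out near 2.57, giving about 25,400 taps for M = 100. The quoted figures of 2.67 and 26,400 appear to correspond to rounding the tap increment down to 1.5 × 10⁻⁴/M before dividing; either way the order of magnitude is the same.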

For other array geometries a relation such as A.31 will generally hold,
except that if the spacing is not uniform d should be the smallest
inter-hydrophone spacing. Then using Eq. A.19 we obtain in general

\[ \Delta\tau = \frac{2dM}{BD\sqrt{\sum_{k=1}^{M}\sum_{\ell=1}^{M}[f'(k,\ell,\theta)]^2}} \tag{A.34} \]


Also, since the maximum delay required is in general D/c, the number, K, of
taps on the delay line must be

\[ K = \frac{D/c}{\Delta\tau} = \frac{BD^2\sqrt{\sum_{k=1}^{M}\sum_{\ell=1}^{M}[f'(k,\ell,\theta)]^2}}{2cdM} \tag{A.35} \]

As is shown in Appendix A the expression under the square root is generally
proportional to $M^2$; e.g., for the circular array it is $M^2/4$. Thus it appears
to be generally true that

\[ K \propto \frac{BD^2}{2cd} \tag{A.36} \]

with the factor of proportionality probably on the order of unity. For
one-dimensional arrays, such as lines or circles, D is proportional to Md, hence
K is approximately proportional to $BM^2d/c$. For two-dimensional arrays D is
proportional to $\sqrt{M}\,d$; therefore K is proportional to $BMd/c$. For volume
distributed arrays, K would be proportional to $BM^{2/3}d/c$. Similarly, the
incremental delay $\Delta\tau$ is inversely proportional to BM for one-dimensional
arrays, to $BM^{1/2}$ for two-dimensional arrays and to $BM^{1/3}$ for three-dimensional
arrays.




ADAPTIVE  ARRAY  PROCESSORS 


by 

John  H.  Chang 
Franz  B.  Tuteur 

Progress Report No. 39


General Dynamics/Electric Boat Research


April 1969

DEPARTMENT OF ENGINEERING
AND APPLIED SCIENCE

YALE  UNIVERSITY 


SUMMARY


This investigation is concerned with the design and analysis of
an adaptive array processor in which the individual filters consist of
tapped-delay lines and adjustable gains. Convergence properties of the
iterative procedures are considered and the performances in filtering
as well as in detection are determined analytically.

Chapter  I  presents  the  background  and  description  of  the  problem 
to  be  considered.  Chapter  II  describes  the  structure  of  tapped-delay- 
line  filters  in  an  array.  The  effect  of  misadjustment  and  the  relation¬ 
ship  between  mean  squared  error  and  the  number  of  delay  elements  are 
discussed. 

In  Chapter  III  the  design  of  adaptive  tapped-delay-line  filters 
is formulated. The method of stochastic approximation and the
mean-squared-error criterion are employed to adjust the gains automatically.
It  is  shown  that  it  is  not  necessary  that  the  desired  signal  generally 
used  to  obtain  the  error  function  be  available.  Either  signal  or  noise 
correlation  functions  will  suffice  to  generate  the  error  gradient. 
Problems  basic  to  all  adaptive  processes  such  as  the  conditions  for 
convergence,  rate  of  convergence,  choice  of  the  weighting  sequence 
are  answered  with  explicit  expressions.  Adaptation  in  a  nonstationary 
environment  is  considered  in  Chapter  IV  using  algorithms  derived  from 
the  Kalman  filtering  techniques  and  dynamic  stochastic  approximation 
methods. 

In  Chapter  V  one  approach  to  the  design  of  an  optimum  adaptive 
array  detection  system  is  considered.  Use  is  made  of  the  convergence 
properties  of  adaptive  tapped-delay-line  filters  and  the  properties  of 
likelihood-ratio detectors for the case of Gaussian input processes and
low input signal-to-noise ratios. This approach is especially useful
when the received waveforms are disturbed by strong but unknown noise
sources. The performances of the proposed adaptive detector are analyzed
for bandlimited processes. The output signal-to-noise ratio and
directivity  patterns  are  evaluated  and  compared  with  those  of  the 
nonadaptive  systems. 

In Chapter VI results obtained from digital computer simulations
are presented to check the aforementioned analyses using both actual
sonar signals and data generated from random numbers.


TABLE OF CONTENTS


SUMMARY

ACKNOWLEDGEMENT

LIST OF FIGURES AND TABLES

CHAPTER ONE    INTRODUCTION

    1.1  The General Problem
    1.2  Adaptive Filters, Detectors, and State of the Art
    1.3  Problem Statement and Objectives

CHAPTER TWO    GENERAL FORM OF THE ADAPTIVE PROCESSOR

    2.1  Signal and Noise Models
    2.2  The Structure of the Receiver
    2.3  Tapped-Delay-Line Filters in an Array
    2.4  The Tapped-Delay-Line Filters and the Wiener Filters
    2.5  The Effect of Interference on the Processor Structure

CHAPTER THREE  THE ADAPTIVE MECHANISM

    3.1  Introduction
    3.2  Methods of Stochastic Approximation
    3.3  The Design of Adaptive Tapped-Delay-Line Filters
    3.4  Convergence Properties of the Adaptive Tapped-Delay-Line Filters
    3.5  Further Remarks on the Operation of the Proposed System

CHAPTER FOUR   ADAPTATION IN A NONSTATIONARY ENVIRONMENT

    4.1  Introduction
    4.2  Application of the Method of Dynamic Stochastic Approximation
    4.3  Application of the Kalman Filtering Techniques
    4.4  Nonstationarity and the Use of Ordinary Methods of Stochastic Approximation

CHAPTER FIVE   PERFORMANCE ANALYSIS OF THE ADAPTIVE RECEIVER

    5.1  Introduction and Assumptions
    5.2  Statistics of an Array Processor
    5.3  Initial Behavior
    5.4  Final Behavior
    5.5  Adaptive Behavior

CHAPTER SIX    COMPUTER SIMULATIONS AND NUMERICAL EXAMPLES

    6.1  Introduction
    6.2  Computer Simulations
    6.3  Experimental Results
    6.4  Numerical Computations

CHAPTER SEVEN  SUMMARY, CONCLUSION, AND SUGGESTIONS FOR FUTURE RESEARCH

    7.1  Summary and Conclusion
    7.2  Suggestions for Future Research

APPENDIX A     THE OPTIMUM DETECTOR FOR DETECTION OF A GAUSSIAN SIGNAL
APPENDIX B     PROOF OF THEOREM 1
APPENDIX C     SOME PROPERTIES OF GAMMA FUNCTION
APPENDIX D     EFFECT OF UNCERTAIN SIGNAL POWER ON THE FINAL VALUES OF THE GAINS
APPENDIX E     GENERAL DYNAMIC METHODS OF STOCHASTIC APPROXIMATION
APPENDIX F     SUMMARY OF KALMAN FILTERING TECHNIQUES
REFERENCES


LIST OF FIGURES AND TABLES


Figure 1    Signal Model
Figure 2    Noise Model
Figure 3    General Array Processor
Figure 4    Tapped-Delay-Line Filters in an Array
Figure 5    Impulse Response of an Optimum Filter
Figure 6    Structure of Tapped-Delay-Line Filters
Figure 7    Adaptive Mechanism
Table 1     Comparison of Filter Coefficients
Figure 8    Variation of Mean-Squared Error
Figure 9    Variation of Filter Coefficients
Figure 10   Variation of Mean-Squared Error versus the Weighting Sequence
Figure 11   Effect of Uncertain Signal Power
Figure 12   Comparison of the Rates of Convergence
Figure 13   Minimum Mean-Squared Error versus the Number of Taps
Figure 14   Variation of Detector Output
Figure 15   Noise Correlation Functions
Figure 16   Signal Correlation Function
Table 2     Noise Cross-Correlation Coefficients
Table 3     Experimental Results
Figure 17   Variation of Normalized Output Signal-to-Noise Ratio
Figure 18   Variation of Directivity Patterns


CHAPTER  ONE  INTRODUCTION 


1.1  The  General  Problem 

The  problem  of  designing  a  linear  device  to  eliminate  noise  or 
to  predict  the  future  behavior  of  an  incoming  signal  was  considered 
by Wiener [1] more than twenty-five years ago. Wiener filters are
optimal in the least-squares sense for stationary signals. More recent
work by Kalman and Bucy [2] has led to the design of optimal
time-variable linear filters for certain kinds of non-stationary signals.
For such signals, Kalman-Bucy filters can deliver substantially better
performance than Wiener filters.

Both  the  Wiener  and  the  Kalman-Bucy  filters  must  be  designed  on 
the  basis  of  a  priori  or  assumed  knowledge  about  the  statistics  of  the 
input  (useful  signals  and  noises)  to  be  processed.  These  filters  are 
optimum  in  practice  only  when  the  statistical  properties  of  the  actual 
input  signals  match  the  a  priori  information  on  the  basis  of  which  the 
filters were designed. When the a priori information is not known
perfectly,  the  filters  will  not  deliver  optimal  performance.  The 
concept  of  adaptive  filters  has  been  developed  to  solve  such  problems. 

An  adaptive  filter  can  adapt  itself  to  changing  operating  conditions. 
These  changes  may  be  due  to  variations  in  the  input  signals  or  the 
internal structure of the filter. Adaptation is accomplished by observation
of the reaction of the filter to an external signal or to an
internal variation with subsequent goal-directed variation of the filter
parameters so that some quality criterion is minimized.


There  are  several  criteria  for  optimization  of  a  processor  for 
an  array  of  sensors  such  as  hydrophones.  Farran  and  Hills  [3]  have 
used  the  criterion  of  maximization  of  array  gains  to  design  real 
weightings  for  individual  sensors.  With  a  similar  approach  Mermoz  [4] 
has  been  concerned  with  the  optimum  utilization  of  an  array  for 
separation  of  a  signal  of  known  waveform  from  noise.  Wiener  [1]  used 
the  criterion  of  minimizing  signal  distortion  to  design  filters.  Burg 
et  al  [5]  developed  a  theory  for  spatial  processing  of  seismometer  arrays 
based on the Wiener least-squared-error criterion. Bryn [6] used the
evaluation  of  the  Neyman-Pearson  likelihood  ratio  to  minimize  risk, 
whereby  a  theory  of  optimum  signal  processing  has  been  developed  for 
three-dimensional  arrays  operating  on  Gaussian  signals  and  noises.  All 
of  the  above  contributors  were  concerned  with  matrix-inversion  techniques 
for the optimum solution to the array processing problems. Edelblute,
Fisk, and Kinneson [7] have shown that the above criteria yield
equivalent  results  at  a  single  frequency.  Performance  comparison  between 
optimum,  suboptimum,  and  conventional  detection  systems  under  different 
operating  situations  has  been  made  recently  by  Schultheiss  and  Tuteur 
[8,9,10] . 

When the noise or signal distribution is not perfectly known, the
aforementioned detection methods present two major difficulties. If the
underlying statistics are unknown, the previous techniques cannot be
used; if they are incorrectly assumed, the consequent detector performance
can be absurd.

Since  adaptive  filters  can  be  constructed  with  only  partial 
knowledge  about  the  system  and  filters  can  be  incorporated  to  realize 
most detection systems, adaptive detectors can be designed in a similar
fashion. In this study one approach to the design of an optimum adaptive
array  detection  system  is  considered.  Use  is  made  of  the  convergence 
properties  of  adaptive  tapped-delay-line  filters  and  the  properties  of 
likelihood-ratio  detectors  for  the  cases  of  Gaussian  input  processes 
and low signal-to-noise ratios. This approach is especially useful when
the  received  waveforms  are  disturbed  by  strong  but  unknown  noise  sources. 


1.2  Adaptive  Filters,  Detectors,  and  State  of  the  Art 


Considerable interest has been expressed recently in the application
of adaptive filters to communication problems. Widrow [11] and
Gabor et al. [12] have independently investigated and constructed systems
that "learn" or adjust themselves to stochastic signals in order to
minimize error power. Both compare a filtered, noisy signal with a
noise-free signal to obtain the error. The mean-square error as a function
of certain of the filter parameters is a high-order parabolic surface.
These parameters are adjusted according to surface searching procedures
for minimum error. Gabor and Widrow each have constructed their
self-organizing systems in the form of a highly specialized analog computer.
Bucy and Follin [13] suggested an adaptive filter which measures the
spectral densities of the input signal and noise processes and adjusts
its band-pass to give optimum filtering in the Wiener sense.

Narendra and McBride [14] described an optimization technique which
is applicable to filtering problems. The change in each parameter is
determined from an error gradient in parameter space computed by
cross-correlation methods which are independent of signal spectra and require
no test signal or parameter perturbation. This method works if either
the noiseless signal is available or the signal spectrum is known. Some
averaging operation is performed to obtain the parameter increments.

More recently Widrow [15,16] analyzed an adaptive filter consisting
of tapped-delay lines and adjustable gains. The adaptive algorithm was
obtained through heuristic reasoning rather than mathematical rigor.
Some approximate methods were given to estimate the rate of
adaptation, the effect of misadjustments, etc. However, a noise-free signal
or simulated signal is required to adjust the gains.
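For orientation, the kind of tapped-delay-line adaptation referred to here can be illustrated with the well-known Widrow-Hoff least-mean-square (LMS) rule. The sketch below is a generic illustration and not the algorithm developed in this report: the reference signal is assumed to be available noise-free, the input is taken to be white Gaussian noise, and the function name, step size and gain values are all hypothetical.

```python
import random

def lms_identify(true_gains, n_samples=5000, mu=0.05, seed=0):
    """Adjust tapped-delay-line gains with the LMS rule so that the filter
    output tracks a noise-free reference produced by `true_gains`."""
    rng = random.Random(seed)
    n_taps = len(true_gains)
    gains = [0.0] * n_taps
    line = [0.0] * n_taps                      # current contents of the delay line
    for _ in range(n_samples):
        line = [rng.gauss(0.0, 1.0)] + line[:-1]                  # shift in a new sample
        desired = sum(g * x for g, x in zip(true_gains, line))   # noise-free reference
        output = sum(g * x for g, x in zip(gains, line))
        error = desired - output
        # Move each gain along the instantaneous error-gradient estimate.
        gains = [g + mu * error * x for g, x in zip(gains, line)]
    return gains

estimated = lms_identify([0.5, -0.3, 0.2])
print(estimated)
```

The need for the `desired` sequence in this sketch is exactly the requirement noted above: without a noise-free or simulated signal the error cannot be formed.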




A number of authors have applied "adaptive" techniques to the
problems of detecting signals in the presence of noise. The problem of
designing an adaptive filter for a fixed waveform whose time of arrival
is unknown has been considered by Glaser [18]. In his work a statistical
decision theory approach is used. Local waveform uncertainty is expressed
in terms of an a priori probability density function but recurrence time
uncertainty is not. The epoch is instead detected on a local basis and
the assumption is made that epoch measurement is accurate.

Jakowatz,  Shuey,  and  White  [19]  have  proposed  an  adaptive  filter 
for detecting a recurrent fixed waveform. The basic operations are:
(1) comparison of a sample of the incoming waveform with an estimate
of  the  unknown  signal,  (2)  correlation  of  these  two,  (3)  on  the  basis 
of  the  correlator  output,  guess  whether  or  not  a  signal  is  contained 
in  the  current  sample  of  the  incoming  waveform,  (4)  at  those  times 
when  a  signal  is  guessed  to  be  present,  form  a  new  estimate  of  the 
signal  which  consists  of  a  weighted  average  of  that  sample  of  the  input 
with  the  prior  estimate. 

Although basic guidelines from signal detection theory are used in
the adaptive filter of Jakowatz et al., the design approach is
not an optimum one, as the authors indeed recognized. Two characteristic
features are apparent in this adaptive filter. First, a local detection
is required before any modification of the memory is made. Secondly,
the receiver memory is used to remember a single waveform. This is
undoubtedly an inadequate memory for the receiver to be optimum. Their
adaptive filter may be, however, a practical receiver when the local
waveform signal-to-noise ratio is large enough to permit good local
detection. In such a case the simple implementation of a receiver
with a single waveform memory may justify its suboptimum detection
performance.

Daly [20] and Scudder [21] have considered a local detection
problem in which a fixed local waveform recurs in a synchronous
manner. In the local detection case the problem becomes that of
detection where each of the local waveform recurrences uses all
the past information. The approach is Bayesian and one of optimum
receiver design. One central problem is common, however, and that
is the problem of implementing an optimum receiver which requires
an exponentially growing memory. As Scudder [21] pointed out, the
standard nonsequential realization of the optimum receiver is very
complex, grows exponentially with time, and the analysis of its
performance is close to impossible, even using present day computers.

In detection problems, the primary goal is to decide between
two hypotheses: presence of signal plus noise or noise alone. If
one prefers correct decisions to mistakes, Peterson, Birdsall, and
Fox [22] have shown that the optimum receiver is one which realizes
the likelihood ratio of the observation, and this fact does not depend
on any specific quantity to be maximized or minimized.

The likelihood ratio plays a central role in the design of
adaptive receiver realization as it did in the design of optimum
receivers in classical detection theory. The adaptive receiver
realization in this report is obtained by constructing Wiener
filters for each sensor output, cascading the sum of these filters
with the inverse square root of the signal spectrum density, then
squaring and averaging. Since the Wiener filters are approximated
by tapped-delay lines and realized adaptively, the detector
implementation is very simple and practical. For Gaussian
processes and low signal-to-noise cases, the proposed system
will asymptotically form a likelihood ratio detector and at
the same time the output signal-to-noise ratio is maximized.

1.3 Problem Statement and Objectives

The problem considered in this study is the passive detection
of a noise-like signal waveform generated by a source located
in a known direction from the receiver. Typical applications of
this general problem can be found in sonar detection, seismic
detection and radio communications. The sonar application is the
one that primarily motivates this study, and examples will be
taken  from  the  sonar  area.  In  order  to  take  advantage  of  the  known 
directivity  of  the  target  signal,  a  directional  receiver  in  the 
form  of  an  array  is  employed  to  distinguish  signal  from  noise.  In 
the  sonar  application,  the  receiver  consists  of  an  array  of  hydro¬ 
phones,  together  with  an  appropriate  processor.  Generally  speaking, 
the  processor  consists  of  individual  filters  on  each  sensor  output, 
a  summer,  a  post-summation  filter,  a  square-law  device  and  an 
averaging  filter.  The  output  of  the  averaging  filter  is  used  to 
indicate  the  presence  of  a  target  signal. 
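The processing chain just described can be sketched in its simplest conventional form. In the sketch below the individual filters are assumed to be pure integer-sample delays and the post-summation filter is omitted; these simplifications, the function name and the test waveform are illustrative assumptions, not the adaptive processor of this report.

```python
def detector_output(channels, delays):
    """Delay each sensor channel, sum, square and average.

    channels: list of equal-length sample lists, one per hydrophone.
    delays:   integer sample delay applied to each channel (the "individual
              filters", here assumed to be pure delays).
    """
    n = len(channels[0])
    start = max(delays)
    total_power = 0.0
    count = 0
    for t in range(start, n):
        summed = sum(ch[t - dly] for ch, dly in zip(channels, delays))  # summer
        total_power += summed * summed                                   # square-law device
        count += 1
    return total_power / count                                            # averaging filter

# Two hydrophones: the second receives the waveform one sample later.
d_wave = [(-1) ** t for t in range(40)]
ch0 = d_wave
ch1 = [0] + d_wave[:-1]
aligned = detector_output([ch0, ch1], [1, 0])      # delays compensate the propagation
misaligned = detector_output([ch0, ch1], [0, 0])   # array steered elsewhere
print(aligned, misaligned)
```

When the delays compensate the propagation delays the channel signals add coherently and the averaged output is large; when the array is steered elsewhere the output drops, which is the basis for using the averaging filter output as an indicator of target presence.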

In the absence of a target signal the averaging filter output
is the result of noise waveforms picked up by the array elements.
The noise is partly far-field noise and partly locally generated.
The far-field noise is often assumed to be directionally isotropic;
however, there may also be directional noise sources. These
directional noise sources are referred to as interference sources, while
the directionally isotropic component is referred to as ambient
noise. In the absence of interference noise, detection of a target
signal can be based simply on the presence of a directional
component in the received signal. However, if interference sources can
be expected to be present in the noise field, then it is necessary
to define the target in some way to distinguish it from the
directional noise components.

The  research  described  herein  is  concerned  with  developing  a 
system  for  processing  the  outputs  of  a  passive  array  of  hydrophones 
under  the  following  assumptions: 

1) Target, interferences, ambient noise and local noise are assumed
to be gaussian random processes.

2) The sum of interferences and ambient noise is regarded as the
effective noise, which is assumed to be statistically independent
of the target signal.

3) The target-signal component $s_i(t)$ observed at the output of
the ith hydrophone is a linear time-invariant transformation
of d(t), the target-signal component observed at the output of
an ideal isotropic hydrophone located at the origin of the
coordinates. The target direction is known, together with its
autocorrelation function (but not necessarily its power level).

4) The statistics of the noise field are completely unknown.
Interferences may be present, but this is not known. If they are
present, their directions are unknown.

5) The wavefronts of target and interferences are regarded as
plane over the dimensions of the receiving array.

6)  The  processor  is  a  directional  array  whose  gain  is  maximized 
in  the  direction  from  which  the  target  is  expected  to  come. 




Since the processor is to be designed in such a way that it can be
easily implemented and be able to operate well in real time in the
presence of an unknown noise field, adaptive techniques must be employed.
The system proposed here consists of an adaptive linear multichannel
filter and algorithms for iterative adjustment of the filter
coefficients on the tapped-delay lines. A new philosophy is introduced
here for designing adaptive algorithms using the methods of stochastic
approximation. This philosophy allows any given partial information,
e.g., the correlation functions between the wavefront and various
delayed signals, to be incorporated directly into the
weight-adjustment procedure.

This information is completely specified once the spectrum and
the direction of the target are known. Since this term appears in
the adjustment formula, a space-time filter optimum in a
predetermined direction is produced. This filter is supposed to reduce
disturbances coming from other directions. When a signal appears
in the expected direction, a maximum response will show on a display
device. The average bearing response can be obtained from a plot
of the averaged squarer output versus the looking angle of the array.
In most practical situations a narrow peak is considered to be the
target.

Convergence properties of these algorithms are investigated both
analytically  and  using  simulation  experiments  as  examples.  The 
variations  of  error  variance,  signal-to-noise  ratio,  and  directivity 
patterns  during  and  after  the  adaptation  period  are  determined. 




CHAPTER  TWO 


GENERAL  FORM  OF  THE  ADAPTIVE  PROCESSOR 
2.1 Signal and Noise Models

Let us consider an array of K omni-directional hydrophones. If both target
signal and noise are present at each hydrophone, the total signal received by the
ith hydrophone is

\[ x_i(t) = s_i(t) + n_i(t) \tag{2.1-1} \]

where $s_i(t)$ is the signal component and $n_i(t)$ is the noise component. It is
assumed in all cases that the signal originates from a source sufficiently remote
from the hydrophone array so that the wavefront is essentially plane over the
dimensions of the array. This assumption also neglects distortions due to
surface scattering and other propagation effects. Let d(t) be the signal received
by a hypothetical hydrophone situated at an arbitrary reference point in the array.
d(t) is assumed to be a member function of a zero-mean gaussian random process.

If the array and its housing were acoustically transparent, the signal component
at the ith hydrophone is $s_i(t) = d(t-\tau_i)$, where $\tau_i$ represents the propagation
delay between the ith hydrophone and the reference point. Then the signals at all
hydrophones can be represented by the vector

\[ \mathbf{s}(t) = [\,d(t-\tau_1)\;\; d(t-\tau_2)\;\cdots\; d(t-\tau_K)\,] \tag{2.1-2} \]

where K is a constant denoting the number of hydrophones in an array. If this
expression is Fourier transformed, there results

\[ \mathbf{s}(\omega) = D(\omega)\,\mathbf{a}(\omega), \quad\text{where}\quad \mathbf{a}^T(\omega) = [\,e^{-j\omega\tau_1}\;\; e^{-j\omega\tau_2}\;\cdots\; e^{-j\omega\tau_K}\,] \tag{2.1-3} \]

and where $D(\omega)$ is the Fourier transform of d(t).

Let the spectral density of the reference signal be $\Phi_d(\omega)$. Then the signal
field may be represented by the cross-spectral density matrix

\[ \boldsymbol{\Phi}_{ss}(\omega) = \Phi_d(\omega)\,\mathbf{a}^*(\omega)\,\mathbf{a}^T(\omega) \tag{2.1-4} \]
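A small numerical sketch may help fix the structure of the signal cross-spectral density matrix. It assumes the delay vector has elements exp(−jωτ_i) and that the matrix is formed as Φ_d a* aᵀ; the sign convention, the function names and the numerical values are assumptions made for illustration.

```python
import cmath
import math

def steering_vector(omega, delays):
    """Elements exp(-j*omega*tau_i) for the propagation delays tau_i (assumed convention)."""
    return [cmath.exp(-1j * omega * tau) for tau in delays]

def signal_csd_matrix(phi_d, omega, delays):
    """Rank-one matrix phi_d * conj(a_i) * a_k built from the steering vector."""
    a = steering_vector(omega, delays)
    return [[phi_d * ai.conjugate() * ak for ak in a] for ai in a]

delays = [0.0, 1e-4, 2e-4]           # seconds, illustrative values
omega = 2 * math.pi * 1000.0         # rad/sec
phi = signal_csd_matrix(2.0, omega, delays)
for row in phi:
    print([complex(round(v.real, 3), round(v.imag, 3)) for v in row])
```

Every diagonal element equals the reference spectral density, every off-diagonal element has the same magnitude, and the matrix is Hermitian: the signal is perfectly coherent across the array and differs between hydrophones only by a phase.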


Figure 1. Signal Model (diagram labels: propagation; outputs $s_1(t)$, $s_2(t)$, ..., $s_K(t)$)


Figure 2. Noise Model (diagram labels: other noise)


The model used for generating the observed signals is shown in Fig. 1.

The noise background is also assumed to be gaussian and to consist of ambient
noise with power density matrix $\boldsymbol{\Phi}_0(\omega)$ plus interferences with power spectral
density $\boldsymbol{\Phi}_i(\omega)$, i = 1, 2, ..., L. Assuming that all the interferences are
statistically independent, the total noise background is then

\[ \boldsymbol{\Phi}_{nn}(\omega) = \boldsymbol{\Phi}_0(\omega) + \sum_{i=1}^{L}\boldsymbol{\Phi}_i(\omega) \tag{2.1-5} \]


In case the ambient noise is independent from hydrophone to hydrophone and
has power spectral density $\Phi_0(\omega)$ at each hydrophone, and if there is only a
single interference present with spectral density $\Phi_1(\omega)$, then Eq. (2.1-5)
reduces to

\[ \boldsymbol{\Phi}_{nn}(\omega) = \Phi_0(\omega)\,\mathbf{I} + \Phi_1(\omega)\,\mathbf{b}^*(\omega)\,\mathbf{b}^T(\omega) \tag{2.1-6} \]

where  I_  is  the  unity  matrix  and 
b  (w)  =  [e  e  . . .  e 


(2.1-7) 


is  composed  of  the  appropriate  delay  for  each  hydrophone  to  steer  the  array 
conventionally  at  this  single  interference.  The  noise  model  is  shown  in  Fig.  2. 

2.2 The Structure of the Receiver

Receiving arrays consist of individual filters on each hydrophone output,
a post-summation filter, a square-law device, and an averaging filter. A
schematic diagram is shown in Fig. 3. They are commonly used in sonar systems
to increase the ratio of desired signal power received to undesired noise power
received from other sources. The hydrophones are assumed to be omnidirectional
and to be passive, i.e., they receive signals from the surrounding environment.
No signals are transmitted to the environment from the hydrophones.

The signals received by the array from the hydrophones are assumed to result
from two separate mechanisms:

1)  One  component  produced  by  the  target  signal  propagating 
in  the  medium  surrounding  the  hydrophones. 




2)  A  second  component  produced  by  ambient  noise  and 
interfering  sources. 

The total output signal from each hydrophone to the processor is the sum
of the two components described above if the target signal is present, or just
the second component in the absence of the target signal. Normally, the signal
components from the individual hydrophones are related to each other through
some simple linear transformation (such as a pure delay), while the noise
components from these hydrophones are relatively less correlated unless some
interferences are present.

The principle of the array is that, by suitably adding the outputs of individual
hydrophones (perhaps after a linear transformation is applied to each), the
signal components may be made to add up faster than the noise components. Then,
the ratio of signal power to noise power at the summing junction or array
beamformer output may be higher than at the individual hydrophone outputs. It is
also true that array systems are essentially matched filters in space; a
directional signal is matched by a directional receiver. The directionality of an
array is obtained by properly delaying the target signal from each hydrophone and
summing the result. This addition is coherent for signals coming from the
direction corresponding to the delays, but incoherent in other directions.
Therefore, a target signal can be distinguished from the noise because of its
directivity, and a directional array is needed to detect it.
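
The coherent-addition argument can be illustrated with a minimal sketch. It assumes a hypothetical 8-element line array with half-wavelength spacing at 1 kHz; the function names, the geometry, and the numerical values are illustrative choices and do not come from the report. A unit plane wave arriving from the steered direction sums to amplitude K, while a wave from another direction sums to a much smaller value.

```python
import cmath
import math

def steering_vector(delays, omega):
    """Phase factors e^{j*omega*tau_i}, one per hydrophone delay tau_i."""
    return [cmath.exp(1j * omega * tau) for tau in delays]

def beam_response(steer_delays, arrival_delays, omega):
    """Magnitude of the summed output for a unit plane wave with
    arrival_delays when the array is steered with steer_delays."""
    a_steer = steering_vector(steer_delays, omega)
    a_wave = steering_vector(arrival_delays, omega)
    return abs(sum(s.conjugate() * w for s, w in zip(a_steer, a_wave)))

K = 8                          # number of hydrophones (assumed)
OMEGA = 2 * math.pi * 1000.0   # analysis frequency in rad/s (assumed)
SPACING_DELAY = 0.5e-3         # d/c in seconds: half a wavelength at 1 kHz (assumed)

def line_array_delays(angle_deg):
    """Delays of a uniform line array for a plane wave from angle_deg."""
    s = math.sin(math.radians(angle_deg))
    return [i * SPACING_DELAY * s for i in range(K)]

on_target = beam_response(line_array_delays(30), line_array_delays(30), OMEGA)
off_target = beam_response(line_array_delays(30), line_array_delays(-20), OMEGA)
print(on_target, off_target)
```

With independent noise of equal power at each hydrophone, the summed noise power grows only as K while the coherent signal power grows as K squared, which is the improvement in signal-to-noise ratio described above.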

The  Optimum  Receiver 

As shown in Fig. 3 the array processor consists of individual filters
$H_i(\omega)$, i = 1, 2, ..., K, on each hydrophone, a post-summation filter $G(\omega)$, a
square-law device, and an averaging filter $H_{av}(\omega)$. Although $G(\omega)$ can be
included in the individual filters, it is considered separately for convenience.

There are several criteria of optimization of a processor for an array of
hydrophones. It has been shown by many authors [3] - [7] and very briefly in
Appendix A that the optimum individual filters

$$\mathbf{H}^T(\omega) = [\, H_1(\omega) \;\; H_2(\omega) \;\cdots\; H_K(\omega) \,]$$  (2.2-1)

for the models described in Sect. 2.1 are of the form

$$\mathbf{H}(\omega) = \mathbf{\Phi}_{nn}^{*-1}(\omega)\, \mathbf{a}^*(\omega)$$  (2.2-2)

The  form  of  the  individual  filters  is  found  to  be  invariant  under  changes 
of  optimization  criteria.  Only  the  optimum  post-summation  filter  G(w)  needs 
to  be  modified. 

Assuming Gaussian statistics for both the signal and noise and using a
likelihood ratio test, $G(\omega)$ is found to be

Gt  (w)  =  $2  [i  +  <p  a*T  r1  a]“  1  (2.2-3) 

l  a  a  —  — nn  — 


If  one  is  interested  in  estimation  and  minimizes  the  mean  squared  error  between 
the  target  signal  and  summer  output1,  the  appropriate  filter  is 

Gm(m)  =  c£  («)  (2.2-4) 

If  one  maximizes  the  signal-to-noise  ratio  at  the  detector  output  G(w)  is 

1 

G^(m)  =  G^(o))/4i^^(u))  (2.2-5) 

An interesting simplification occurs for the case of small signal-to-noise ratio
at the input to the squarer, or when

$$\phi_d\, \mathbf{a}^{*T}\, \mathbf{\Phi}_{nn}^{-1}\, \mathbf{a} \ll 1$$  (2.2-6)

then  ^ 

CL(m)  -  GM(u.)  =  <i^(u>)  (2.2-7) 

and the detector structure is essentially the same regardless of whether the
design criterion is to use a likelihood ratio test or to maximize the output
signal-to-noise ratio.


This  is  really  not  what  we  intend  to  do,  but  Eq.  (2.2-4)  is  included  here  for 
reference. 




From Eqs. (2.2-2) and (2.2-7) we can alternatively write the optimum
filters  as 


H 

-op 


.*-1 

^  A 

-nn 


*  *-l  * 

a  A  id£ 


da  Li  ■* 

—  =-  -nn 


and the post-summation filter
1 

G(u)  -  i"2(u>) 


(2.2-8) 


(2.2-9) 


$\mathbf{\Phi}_{d\xi}$, appearing in Eq. (2.2-8), is just the spectral vector between the
reference signal and the various signal components of the hydrophone outputs.

In any system which operates in a realistic noise field, the optimal filters
must be periodically revised. It is important, therefore, that these filters
assume a form that can readily be changed. The results concerning optimal filters
described by Eqs. (2.2-2) and (2.2-8) have assumed that the filters are arbitrary,
without constraints, and they cannot be constructed without statistical knowledge
about both the signal and the noise. In the following section we shall examine the
rather practical situation in which filters in an array consist of weighted
tapped-delay lines. This type of filter can approximate the physically
unrealizable Wiener filters such as Eq. (2.2-2) or (2.2-8) to any desired degree.
Furthermore, adaptive techniques can be applied to automatically adjust the
weights on these lines without using noise statistics. The relationships between
tapped-delay-line filters and the Wiener filters are discussed shortly. The
adaptive part will be treated in Chapter Three.


2.3 Tapped-Delay-Line Filters in an Array

We shall first of all describe the structure of a tapped-delay-line
multichannel filter processing the outputs of K hydrophones. The output of each
hydrophone enters a tapped-delay line, and is picked off at various taps (usually
equally spaced) on the delay line, delayed in time but unchanged in wave-shape.
The signal from each tap is passed through an associated variable attenuator
(the weight); all the attenuator output signals are then summed.


Figure 4. Tapped-Delay-Line Filter in an Array


a) Notations and the Filter Output

It is seen in Fig. 4 that each delay line consists of (M+1) tap points leading
to (M+1) weights $c_{ik}$. The tap points are separated by M ideal delays of $\Delta$
seconds each. Note that each weight is indexed by its tap point, and each tap
point is identified by the index of the succeeding delay.


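Define

$$c_{ik} \triangleq k\text{th weight on the } i\text{th filter}$$  (2.3-1)

$$w_{(i-1)M+k} \triangleq c_{ik}$$  (2.3-2)

$$\eta_{(i-1)M+k}(t) \triangleq x_i(t-k\Delta)$$  (2.3-3)

$$\xi_{(i-1)M+k}(t) \triangleq s_i(t-k\Delta)$$  (2.3-4)

$$\nu_{(i-1)M+k}(t) \triangleq n_i(t-k\Delta)$$  (2.3-5)

where i = 1, 2, ..., K is the hydrophone index and k = 0, 1, 2, ..., M is the tap
point index. Using vector notation, the column vector of output signals $\boldsymbol{\eta}$ for
the entire hydrophone array may be written as the sum of a delayed target signal
vector $\boldsymbol{\xi}$ and a delayed noise vector $\boldsymbol{\nu}$, or

$$\boldsymbol{\eta} = \boldsymbol{\xi} + \boldsymbol{\nu}$$  (2.3-6)

where

$$\boldsymbol{\eta}^T \triangleq [\, \eta_1(t) \;\; \eta_2(t) \;\cdots\; \eta_{K(M+1)}(t) \,]$$  (2.3-7)

$$\boldsymbol{\xi}^T \triangleq [\, \xi_1(t) \;\; \xi_2(t) \;\cdots\; \xi_{K(M+1)}(t) \,]$$  (2.3-8)

$$\boldsymbol{\nu}^T \triangleq [\, \nu_1(t) \;\; \nu_2(t) \;\cdots\; \nu_{K(M+1)}(t) \,]$$  (2.3-9)

and $\boldsymbol{\eta}^T$ denotes the transpose of $\boldsymbol{\eta}$.

If the weight vector $\mathbf{W}$ is defined as

$$\mathbf{W}^T \triangleq [\, w_1 \;\; w_2 \;\cdots\; w_{K(M+1)} \,]$$  (2.3-10)

the filter output z(t) is then

$$z(t) = \mathbf{W}^T \boldsymbol{\eta}(t)$$  (2.3-11)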
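
The filter output of Eq. (2.3-11) can be sketched in discrete time as follows. This is a minimal illustration that assumes the tap spacing equals the sampling interval and that samples before the start of the record are zero; the function names and the zero-based stacking index i*(M+1)+k are choices made here for the example and are not the report's indexing.

```python
def tdl_output(x, c, j):
    """Output z_j of a tapped-delay-line array filter.

    x[i][n] is sample n of hydrophone i, c[i][k] is the weight on tap k of
    filter i. The tap spacing is one sample (an assumption of this sketch).
    """
    z = 0.0
    for x_i, c_i in zip(x, c):
        for k, weight in enumerate(c_i):
            if j - k >= 0:              # samples before the record are taken as zero
                z += weight * x_i[j - k]
    return z

def stacked_vectors(x, c, j):
    """Flatten the weights and tap signals into the vectors W and eta."""
    W, eta = [], []
    for x_i, c_i in zip(x, c):
        for k, weight in enumerate(c_i):
            W.append(weight)
            eta.append(x_i[j - k] if j - k >= 0 else 0.0)
    return W, eta

# Two hydrophones, two taps each (illustrative numbers only).
x = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]]
c = [[0.5, 0.25], [1.0, -1.0]]
z3 = tdl_output(x, c, 3)
W, eta = stacked_vectors(x, c, 3)
inner_product = sum(w * e for w, e in zip(W, eta))
print(z3, inner_product)
```

Both routes give the same number, which is the point of the stacked notation: the double sum over hydrophones and taps is a single inner product of the weight vector with the vector of tap signals.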




Note that $\mathbf{W}$ and $\boldsymbol{\eta}$ are K(M+1)-dimensional vectors.

The equations given above express the continuous time notation for the
variables used in this research. Uniform discrete-time samples of these
quantities are also of interest and are expressed by using the discrete-time
index j. Time samples are assumed to be taken at intervals of $T_{samp}$ seconds and,
for notational simplicity, the values of the various parameters at the jth
sampling instant are expressed as

$$\boldsymbol{\eta}_j \triangleq \boldsymbol{\eta}(t)\,\big|_{t = jT_{samp}}$$  (2.3-12)

$$z_j \triangleq z(t)\,\big|_{t = jT_{samp}}$$  (2.3-13)

Because the present work is concerned with iterative weight-adjustment
procedures, the jth sampling instant is associated with the jth iteration of the
weight vector. Thus, the value of the K(M+1)-dimensional weight vector at the
jth iteration is $\mathbf{W}_j$. Hence a weight parameterized by a discrete time index j
is interpreted as the jth iterated value of the weight, while an unparameterized
weight, as in (2.3-2), is interpreted as a time-invariant quantity.

b) Autocorrelation Matrices of the Input

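When both the target signal and the noise processes are described in terms
of their statistical properties, the performance of the system can be evaluated
in terms of its average behavior. The quantity of most interest is the second
statistical moment. For the K-dimensional vector of array output signals, $\mathbf{X}(t)$,
the second moment becomes the (K x K)-dimensional autocorrelation matrix $\mathbf{R}_x(\tau)$
given by

$$\mathbf{R}_x(\tau) \triangleq E[\, \mathbf{X}(t)\, \mathbf{X}^T(t-\tau) \,]$$  (2.3-14)

where $\mathbf{X}^T(t) = [\, x_1(t) \;\; x_2(t) \;\cdots\; x_K(t) \,]$, $E[\,\cdot\,]$ denotes "expected value",
and $\tau$ is a running time-delay variable. For the K(M+1)-dimensional vector of
all the signals observed at the weights, $\boldsymbol{\eta}(t)$, the second moment is the
K(M+1) x K(M+1)-dimensional autocorrelation matrix $\mathbf{R}_\eta(\tau)$ given by

$$\mathbf{R}_\eta(\tau) = E[\, \boldsymbol{\eta}(t)\, \boldsymbol{\eta}^T(t-\tau) \,]$$  (2.3-15)

Using Eq. (2.3-7) in Eq. (2.3-15) gives

$$\mathbf{R}_\eta(\tau) = E\left\{ \begin{bmatrix} \eta_1(t) \\ \vdots \\ \eta_{K(M+1)}(t) \end{bmatrix} [\, \eta_1(t-\tau) \;\cdots\; \eta_{K(M+1)}(t-\tau) \,] \right\}$$  (2.3-16)

and, using Eq. (2.3-14), the second moment $\mathbf{R}_\eta$ becomes

$$\mathbf{R}_\eta(\tau) = \begin{bmatrix} \mathbf{R}_x(\tau) & \mathbf{R}_x(\tau+\Delta) & \cdots & \mathbf{R}_x(\tau+M\Delta) \\ \mathbf{R}_x(\tau-\Delta) & \mathbf{R}_x(\tau) & \cdots & \vdots \\ \vdots & & \ddots & \\ \mathbf{R}_x(\tau-M\Delta) & \cdots & & \mathbf{R}_x(\tau) \end{bmatrix}$$  (2.3-17)

The above matrix is in the form of a Toeplitz matrix having equal matrix-valued
elements along any diagonal. Note that by the assumption of independence of
signal and noise components, we have

$$\mathbf{R}_\eta(\tau) = \mathbf{R}_\xi(\tau) + \mathbf{R}_\nu(\tau)$$  (2.3-18)

where $\mathbf{R}_\xi(\tau)$ and $\mathbf{R}_\nu(\tau)$ are, respectively, the K(M+1) x K(M+1)-dimensional signal
and noise autocorrelation matrices given by

$$\mathbf{R}_\xi(\tau) = E[\, \boldsymbol{\xi}(t)\, \boldsymbol{\xi}^T(t-\tau) \,]$$  (2.3-19)

$$\mathbf{R}_\nu(\tau) = E[\, \boldsymbol{\nu}(t)\, \boldsymbol{\nu}^T(t-\tau) \,]$$  (2.3-20)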


These matrices are also of the Toeplitz form analogous to Eq. (2.3-17). The
advantage of the Toeplitz configuration is that the entire matrix can be
constructed from the first row of the submatrices, i.e., from the matrices
$\mathbf{R}_x(\tau)$, $\mathbf{R}_x(\tau+\Delta)$, ..., $\mathbf{R}_x(\tau+M\Delta)$ in Eq. (2.3-14). Thus the
K(M+1) x K(M+1)-dimensional autocorrelation matrix can be stored as a
K x K(M+1)-dimensional matrix:

$$[\, \mathbf{R}_x(\tau) \;\; \mathbf{R}_x(\tau+\Delta) \;\cdots\; \mathbf{R}_x(\tau+M\Delta) \,]$$

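
The storage saving can be sketched as follows for the zero-lag case. The example assumes a stationary input, so that the block at lag -m is the transpose of the block at lag +m, and it stacks the tap signals tap by tap so that the blocks are indexed by tap number; the function names and the numerical blocks are hypothetical.

```python
def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]

def block_toeplitz(first_row_blocks):
    """Rebuild the full autocorrelation matrix from its first block row.

    first_row_blocks[m] is the K x K matrix R_x(m * Delta), m = 0..M.
    Block (k, l) of the result is R_x((l - k) * Delta); for negative lags the
    stationarity relation R_x(-m * Delta) = R_x(m * Delta)^T is used.
    """
    n = len(first_row_blocks)
    K = len(first_row_blocks[0])
    full = [[0.0] * (n * K) for _ in range(n * K)]
    for k in range(n):
        for l in range(n):
            if l >= k:
                block = first_row_blocks[l - k]
            else:
                block = transpose(first_row_blocks[k - l])
            for r in range(K):
                for s in range(K):
                    full[k * K + r][l * K + s] = block[r][s]
    return full

# Two hydrophones (K = 2) and two taps (M = 1); values are illustrative only.
R0 = [[2.0, 0.5], [0.5, 2.0]]   # R_x(0), symmetric
R1 = [[1.0, 0.2], [0.3, 1.0]]   # R_x(Delta)
full = block_toeplitz([R0, R1])
for row in full:
    print(row)
```

Only the K x K(M+1) first block row is stored; the remaining blocks are copies or transposes of it.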
c)  Optimum  Weights 

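The difference between the summer output and the desired (reference) signal
is the error function

$$e(t) = d(t) - z(t)$$  (2.3-21)

Using the notation defined previously we can write the square of the error as

$$e^2(t) \triangleq Q(e) = d^2 - 2\, d\, \boldsymbol{\eta}^T \mathbf{W} + \mathbf{W}^T \boldsymbol{\eta}\, \boldsymbol{\eta}^T \mathbf{W}$$  (2.3-22)

the mean value of which is

$$\overline{e^2} = \overline{Q(e)} = \overline{d^2} - 2\, \overline{d\, \boldsymbol{\eta}^T}\, \mathbf{W} + \mathbf{W}^T \mathbf{R}_\eta \mathbf{W}$$  (2.3-23)

To obtain the optimum vector which minimizes the mean-squared error, we take
the gradient of Eq. (2.3-23) and set the resulting form to zero,

$$\nabla_W\, \overline{e^2} = -2\, \overline{d\, \boldsymbol{\eta}} + 2\, (\mathbf{R}_\xi + \mathbf{R}_\nu)\, \mathbf{W} = 0$$  (2.3-24)

$$\mathbf{W}_{op} = (\mathbf{R}_\xi + \mathbf{R}_\nu)^{-1}\, \mathbf{R}_{d\xi}$$  (2.3-25)

where

$$\mathbf{R}_{d\xi}^T = [\, \overline{d(t)\eta_1(t)} \;\cdots\; \overline{d(t)\eta_{K(M+1)}(t)} \,] = [\, \overline{d(t)\xi_1(t)} \;\cdots\; \overline{d(t)\xi_{K(M+1)}(t)} \,]$$  (2.3-26)

is determined completely by the signal correlation function $R_d(\tau)$ and the various
delays, for independent signal and noise. Note that in Eq. (2.3-25) $\mathbf{R}_\xi$, $\mathbf{R}_\nu$
and $\mathbf{R}_{d\xi}$ are shorthand for $\mathbf{R}_\xi(0)$, $\mathbf{R}_\nu(0)$ and $\mathbf{R}_{d\xi}(0)$.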

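d) Effect of Non-Optimum Gains

The effect of non-optimum gains on the mean-squared error is considered here.
The absolute minimum mean-squared error achievable in using the tapped-delay
lines is obtained by substituting Eq. (2.3-25) into Eq. (2.3-23):

$$\overline{e^2}_{min} = \overline{d^2} - 2\, \overline{d\, \boldsymbol{\eta}^T}\, \mathbf{W}_{op} + \mathbf{W}_{op}^T \mathbf{R}_\eta \mathbf{W}_{op}$$

$$= \overline{d^2} - \mathbf{R}_{d\xi}^T \mathbf{W}_{op}$$

$$= \overline{d^2} - \mathbf{R}_{d\xi}^T \mathbf{R}_\eta^{-1} \mathbf{R}_{d\xi}$$  (2.3-27)

Using Eq. (2.3-27) and Eq. (2.3-25), Eq. (2.3-23) can be expressed for any
fixed $\mathbf{W}$ as

$$\overline{e^2} = \overline{d^2} - 2\, \mathbf{R}_{d\xi}^T \mathbf{W} + \mathbf{W}^T \mathbf{R}_\eta \mathbf{W}$$

$$= \overline{e^2}_{min} + \mathbf{R}_{d\xi}^T \mathbf{W}_{op} - 2\, \mathbf{R}_{d\xi}^T \mathbf{W} + \mathbf{W}^T \mathbf{R}_\eta \mathbf{W}$$

$$= \overline{e^2}_{min} + \mathbf{W}_{op}^T \mathbf{R}_\eta \mathbf{W}_{op} - 2\, \mathbf{W}_{op}^T \mathbf{R}_\eta \mathbf{W} + \mathbf{W}^T \mathbf{R}_\eta \mathbf{W}$$

$$= \overline{e^2}_{min} + (\mathbf{W} - \mathbf{W}_{op})^T \mathbf{R}_\eta (\mathbf{W} - \mathbf{W}_{op})$$  (2.3-28)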
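
A small numerical check of Eqs. (2.3-25), (2.3-27) and (2.3-28) may help. The sketch below uses a hypothetical two-weight problem with made-up correlation values; the solver and helper names are illustrative, not part of the report.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def quad(u, R, v):
    """Bilinear form u^T R v."""
    return sum(u[i] * R[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))

def mean_squared_error(W, d2, r, R):
    """Eq. (2.3-23): d2 - 2 r^T W + W^T R W."""
    return d2 - 2.0 * sum(ri * wi for ri, wi in zip(r, W)) + quad(W, R, W)

# Illustrative second moments (assumed values, not taken from the report).
R_eta = [[2.0, 0.5], [0.5, 1.0]]   # input autocorrelation matrix
R_dxi = [1.0, 0.5]                 # cross-correlation with the desired signal
d2 = 1.5                           # mean square of the desired signal

W_op = solve(R_eta, R_dxi)                           # Eq. (2.3-25)
e_min = mean_squared_error(W_op, d2, R_dxi, R_eta)   # Eq. (2.3-27)

W = [1.0, 0.0]                                       # an arbitrary non-optimum gain vector
dW = [w - wo for w, wo in zip(W, W_op)]
excess = quad(dW, R_eta, dW)                         # last term of Eq. (2.3-28)
print(W_op, e_min, mean_squared_error(W, d2, R_dxi, R_eta), e_min + excess)
```

The last two printed numbers agree, showing that the penalty for a non-optimum gain vector is the quadratic form in the gain deviation.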


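If we relate the arbitrary gain to the optimum one by

$$\mathbf{W} = \mathbf{W}_{op} + \Delta\mathbf{W}$$

then from Eq. (2.3-28) the difference in mean squared error due to non-optimum
values of $\mathbf{W}$ is

$$\Delta\overline{e^2} = \overline{e^2} - \overline{e^2}_{min} = (\mathbf{W} - \mathbf{W}_{op})^T \mathbf{R}_\eta (\mathbf{W} - \mathbf{W}_{op})$$

$$= \sum_{i=1}^{K(M+1)} \sum_{h=1}^{K(M+1)} \Delta w_i\, \overline{\eta_i \eta_h}\, \Delta w_h$$

$$\le K^2 (M+1)^2\, \max_{\text{all } i} |\Delta w_i|^2\, \max_{\text{all } i,h} |\overline{\eta_i \eta_h}|$$  (2.3-29)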




Thus, the error due to non-optimum gains is bounded if the deviations of the
gains and the input correlation functions are bounded. Note especially that
for tapped-delay-line filters $\max_{\text{all } i,h} |\overline{\eta_i \eta_h}| \le R_d(0) + R_n(0)$.


The  multichannel  Wiener  filter  which  minimizes  the  mean-squared  error 
between  the  summer  output  in  an  array  and  the  target  signal  is  obtained  by 
combining  Eqs.  (2.2-2),  (2.3-3)  and  (2.2-4).  The  individual  filters  in  this 
case  become 


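$$\mathbf{H}_m(\omega) = \frac{\phi_d\, \mathbf{\Phi}_{nn}^{*-1}\, \mathbf{a}^*}{1 + \phi_d\, \mathbf{a}^{*T}\, \mathbf{\Phi}_{nn}^{-1}\, \mathbf{a}} = [\, \mathbf{\Phi}_{ss}^* + \mathbf{\Phi}_{nn}^* \,]^{-1}\, \mathbf{a}^* \phi_d = \mathbf{\Phi}_{xx}^{*-1}\, \mathbf{\Phi}_{ds}^*$$  (2.4-1)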


where $\mathbf{\Phi}_{xx}(\omega)$ is the input spectral matrix

$$\mathbf{\Phi}_{xx}(\omega) = \mathbf{\Phi}_{ss}(\omega) + \mathbf{\Phi}_{nn}(\omega)$$  (2.4-2)

and $\mathbf{\Phi}_{ds}(\omega)$ is the spectral density vector between the desired signal and
the various signal components of the hydrophone outputs.

Eq. (2.4-1) is the generalization of the single channel Wiener filter

$$H_a(\omega) = \frac{\phi_d(\omega)}{\phi_d(\omega) + \phi_n(\omega)} = \frac{\phi_d(\omega)}{\phi_x(\omega)}$$  (2.4-3)

for the case of long delay, $\alpha \to \infty$ [1]. The delay factor $e^{-j\omega\alpha}$ is missing in
Eq. (2.4-1), but it is understood that Wiener filters of this type are not
physically realizable. Although they cannot be constructed by RLC networks,
they can, however, be approximated by tapped-delay lines in practice. We
will first of all show that the tapped-delay-line filters with gains given by
Eq. (2.3-25) will approximate the continuous Wiener filter Eq. (2.4-1) in the
mean squared sense.

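Consider a transfer function vector

$$\mathbf{H}_m(\omega) = \mathbf{\Phi}_{xx}^{*-1}(\omega)\, \mathbf{\Phi}_{ds}^*(\omega)$$  (2.4-4)

and the transfer function vector for the tapped-delay-line filters

$$\mathbf{H}_T(\omega) = \sum_{k=0}^{M} \mathbf{C}_k\, e^{-j\omega k\Delta}$$  (2.4-5)

where $\mathbf{C}_k^T = [\, c_{1k} \;\; c_{2k} \;\cdots\; c_{Kk} \,]$, k = 0, 1, 2, ..., M.
Equating Eqs. (2.4-4) and (2.4-5) yields

$$\sum_{k=0}^{M} \mathbf{C}_k\, e^{-j\omega k\Delta} = \mathbf{\Phi}_{xx}^{*-1}(\omega)\, \mathbf{\Phi}_{ds}^*(\omega)$$  (2.4-6)

Premultiplying both sides of Eq. (2.4-6) by $\mathbf{\Phi}_{xx}^*(\omega)$ gives

$$\sum_{k=0}^{M} \mathbf{\Phi}_{xx}^*(\omega)\, \mathbf{C}_k\, e^{-j\omega k\Delta} = \mathbf{\Phi}_{ds}^*(\omega)$$  (2.4-7)

Multiplying the above equation by $e^{j\omega l\Delta}$ and integrating in the frequency domain,
we have

$$\sum_{k=0}^{M} \frac{1}{2\pi} \int d\omega\, \mathbf{\Phi}_{xx}^*(\omega)\, e^{j\omega(l-k)\Delta}\, \mathbf{C}_k = \frac{1}{2\pi} \int d\omega\, \mathbf{\Phi}_{ds}^*(\omega)\, e^{j\omega l\Delta}$$  (2.4-8)

But the frequency integrations are just correlation functions, i.e.,

$$\frac{1}{2\pi} \int d\omega\, \mathbf{\Phi}_{xx}^*(\omega)\, e^{j\omega(l-k)\Delta} = \begin{bmatrix} R_{x_1 x_1}(l\Delta - k\Delta) & \cdots & R_{x_1 x_K}(l\Delta - k\Delta) \\ \vdots & & \vdots \\ R_{x_K x_1}(l\Delta - k\Delta) & \cdots & R_{x_K x_K}(l\Delta - k\Delta) \end{bmatrix} \triangleq \mathbf{R}_{xx}(l\Delta - k\Delta)$$  (2.4-9)

and

$$\frac{1}{2\pi} \int d\omega\, \mathbf{\Phi}_{ds}^*(\omega)\, e^{j\omega l\Delta} = \begin{bmatrix} R_{ds_1}(l\Delta) \\ \vdots \\ R_{ds_K}(l\Delta) \end{bmatrix} \triangleq \mathbf{R}_{ds}(l\Delta)$$  (2.4-10)

Thus Eq. (2.4-8) can be written as

$$\sum_{k=0}^{M} \mathbf{R}_{xx}(l\Delta - k\Delta)\, \mathbf{C}_k = \mathbf{R}_{ds}(l\Delta), \qquad l = 0, 1, \ldots, M$$  (2.4-11)

or

$$\sum_{i=1}^{K} \sum_{k=0}^{M} R_{x_i x_h}(k\Delta - l\Delta)\, c_{ik} = R_{ds_h}(l\Delta)$$  (2.4-12)

for l = 0, 1, 2, ..., M; h = 1, 2, ..., K. Using the definitions of various
correlation matrices we see that Eq. (2.4-12) is equivalent to Eq. (2.3-25).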

$\mathbf{H}_m(\omega)$ is approximated by $\mathbf{H}_T(\omega)$ in the sense of minimizing the mean
squared error in the frequency domain. For the sake of simplicity, let us
first treat the case of a single filter.

If the transfer function $H(\omega)$ can be regarded as being band limited to
$(-\omega_c < \omega < \omega_c)$, then by simple Fourier expansion we have

$$H(\omega) = \sum_{k=-\infty}^{\infty} c_k\, e^{-jk\Delta\omega}$$  (2.4-13)

where 


(2.4-14) 


and  the  c's  are  Fourier  coefficients 


„  & .  r 


HCO^  f  ■  ! 

i. 


,  .  ikAu)  daj 
H(m)eJ  2tt 


(2.4-15) 




The impulse response of the Wiener filter Eq. (2.4-3) generally takes the
form shown in Fig. 5; i.e., it has a peak value at $\tau = \alpha$, and $|h(\tau)| \to 0$ as
$|\tau - \alpha| \to \infty$. The memory of the filter can therefore be defined as the value
of $\tau$ for which the ratio $|h(\tau)| / |h_{max}(\tau)|$ has some predetermined small value
that is not exceeded beyond this value of $\tau$ on the positive side of the peak, or
beyond the corresponding value on the negative side. If the noise spectrum is
relatively flat the filter memory is proportional to the correlation time $T_c$
of the signal.

For a filter having a finite memory, $c_k \to 0$ for sufficiently large k. In
general, $c_{-k} = c_k^*$, and hence if $c_k \to 0$ so does $c_{-k}$. The infinite series of
Eq. (2.4-13) can therefore be truncated to a finite series running from
k = -M/2 to M/2 (where, for simplicity, M is assumed to be an even number),
and the resulting finite sum will approximate $H(\omega)$ as closely as is desired by
making M+1, the number of taps, large enough. Then

$$H(\omega) \approx \sum_{k=-M/2}^{M/2} c_k\, e^{-j\omega k\Delta} = \sum_{l=0}^{M} c_{l-M/2}\, e^{-j\omega\Delta(l-M/2)} = e^{j\omega\Delta M/2} \sum_{l=0}^{M} c'_l\, e^{-jl\Delta\omega}$$  (2.4-16)

where l = k + M/2 and $c'_l = c_{l-M/2}$.

It is readily seen that the summation terms in Eq. (2.4-16) can be constructed
using weighted tapped-delay lines. The middle of the delay line corresponds to
the term k = 0.
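
The effect of truncating the series can be checked numerically. The sketch below assumes a hypothetical set of real, even coefficients c_k = 0.5^|k| and a tap spacing of one unit, and it treats a 121-term series as the reference response; none of these values come from the report. By Parseval's relation the mean squared error over one period equals the energy of the discarded coefficients.

```python
import cmath
import math

def coeff(k, r=0.5):
    """Hypothetical Fourier coefficient c_k = r^|k| (an assumed example)."""
    return r ** abs(k)

def series(omega, half_width, delta=1.0):
    """Partial sum of c_k e^{-j omega k delta} for |k| <= half_width."""
    return sum(coeff(k) * cmath.exp(-1j * omega * k * delta)
               for k in range(-half_width, half_width + 1))

def truncation_mse(M, reference_half_width=60, delta=1.0, n_grid=512):
    """Mean of |H - H_M|^2 over one period, using M + 1 taps for H_M."""
    total = 0.0
    for n in range(n_grid):
        omega = -math.pi / delta + 2.0 * math.pi * n / (n_grid * delta)
        err = series(omega, reference_half_width, delta) - series(omega, M // 2, delta)
        total += abs(err) ** 2
    return total / n_grid

def tail_energy(M, reference_half_width=60):
    """Energy of the discarded coefficients, 2 * sum of c_k^2 for k > M/2."""
    return 2.0 * sum(coeff(k) ** 2 for k in range(M // 2 + 1, reference_half_width + 1))

mse_8 = truncation_mse(8)
tail_8 = tail_energy(8)
print(mse_8, tail_8)
```

Doubling the number of taps removes more of the tail, so the error falls quickly once the delay line spans the memory of the filter.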

The  minimum  mean  squared  error  resulting  from  the  process  of  approximating 
M/2  , 

H(m)  by  -K^/2  °k  e  ^  is  by  choosing  the  coef f icier.ts  in  accordance 

with  Eq.  (2,4-15) 




f  2lt(d 

1  I 
2*  I 

^  — 2»r  u 


0  M^2  .  *  * 

!H(ui)  "  k*-M/2  Ck  6_J  |2  dU 


1_ 

2n 


2nu) 


“2TICU 


M/2 


i  r 


M/2 


2w  1  ;HU)i‘  du)  '  2  k^o  lck!‘ 


OD  j  I  2 

2  k-M/2+1  ck'  (2.4-17) 

Thus M is determined by the maximum acceptable value of $\overline{e^2}$. Since $|c_k| \to 0$ for
values of k such that $k\Delta \gg T_c$, the signal correlation time, it also follows
that M is proportional to $T_c/\Delta$ with the proportionality constant chosen to
produce an acceptable mean square error. A typical value of M might be
$4T_c/\Delta$.

The extension of this argument to arrays of filters such as in Figure 3 is
immediate except that in general the filters must introduce additional delays
in order to steer the array. Hence the impulse response $h_i(t)$ of the ith
filter peaks at $t = \tau_i$ and diminishes for values of t away from $\tau_i$. If all
the filters consist of delay lines having M+1 taps, then by reference to Eq.
(2.4-16) one can make


Hi(u) 


M/2 

k=M/2 


j-jj(k+k1)1J 


j -  ('■;/ 2-k, ) A.  “  ,  -jlAw  (2.4-18) 

e  1  i  =  0  C£,  e 

where,  as  in  Eq.  (2.4-16),  i  =  k+M/2  ,  =  cj-m/2  ’  and  where  ln  addltlon 

*  t  .  The  value  of  ^  for  which  c’  ln  maximum  is  then  given  by 




then 


i  *  M/2  -  .  Also,  if  the  maximum  delay  is  such  that 

"  max 

M  must  clearly  be  increased  by  k  .  over  the  value  needed  for  a  single  filter 

max 

applied  to  the  same  signal  spectrum,  i.e.,  typically  M  might  be 

4T 

K  -  -j5  +  kt  (2.4-19) 

max 

2.5 The Effect of Interferences on the Processor Structure

In this section we consider the effect of interferences on the processor
structure under the assumptions that

1) The input spectra are identical in shape (but not
in power levels) over the frequency range $(0, \omega_c)$
where most of the input power is concentrated.

2) The directions of the interferences are known
exactly and the tap spacings are set to the
desired amount.

The optimum individual filters in a general array configuration are
rewritten here for convenience

$$\mathbf{H}_{op} = \mathbf{\Phi}_{nn}^{*-1}\, \phi_d\, \mathbf{a}^*$$  (2.5-1)

where the noise spectral matrix is given by Eq. (2.1-5)

$$\mathbf{\Phi}_{nn} = \mathbf{\Phi}_o + \sum_{i=1}^{L} \mathbf{\Phi}_i$$  (2.5-2)

In the above the $\mathbf{\Phi}$'s are understood to be functions of $\omega$; $\mathbf{\Phi}_o$ is the ambient
noise spectral matrix, and the $\mathbf{\Phi}_i$ (i = 1, 2, ..., L) are the interference spectral
matrices. The signal and noise models shown in Figs. 1 and 2 are used here.

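When necessary, a superscript will be used to indicate various interferences.
For example, the ith interference delay vector is defined as

$$\mathbf{b}^{(i)T} = [\, e^{j\omega\rho_1^{(i)}} \;\; e^{j\omega\rho_2^{(i)}} \;\cdots\; e^{j\omega\rho_K^{(i)}} \,], \qquad i = 1, 2, \ldots, L$$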

a) No interference

If there is no interference, the general noise spectral matrix reduces to
the ambient noise spectral matrix. Let

$$\mathbf{\Phi}_{nn} = \mathbf{\Phi}_o = \begin{bmatrix} \phi_{11} & \phi_{12} & \cdots & \phi_{1K} \\ \vdots & & & \vdots \\ \phi_{K1} & \cdots & & \phi_{KK} \end{bmatrix}$$  (2.5-4)

Figure 6. Structure of Tapped-Delay-Line Filters


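The optimum filters defined in Eq. (2.2-8) become

$$\mathbf{H}_{op} = \phi_d\, \mathbf{\Phi}_o^{-1}\, \mathbf{a}^* = \sum_{m=1}^{K} \mathbf{q}_m\, e^{-j\omega\tau_m}$$  (2.5-5)

where

$$\mathbf{q}_m^T = [\, q_{1m} \;\; q_{2m} \;\cdots\; q_{Km} \,]$$

and $q_{ih}$ is the i-hth element of $[\, \phi_d\, \mathbf{\Phi}_o^{-1} \,]$.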


Since the input spectra are assumed to be similar, the term $q_{ih}$ will be a
constant for i = 1, 2, ..., K; h = 1, 2, ..., K, K being the number of hydrophones
in an array. The ith optimal filter can be constructed using K taps with the
weights set at

$$c_{ik} = q_{ik}$$  (2.5-6)

for the input signal $x_i(t)$ delayed by $\tau_k$ seconds. The implementation is
shown in Fig. 6a.

If the ambient noise is independent from hydrophone to hydrophone, we then
have the simpler implementation shown in Fig. 6b. Here only a single gain
$c_{ii} = q_{ii}$ is used to weigh the delayed input $x_i(t - \tau_i)$. This system
is similar to that studied by Schultheiss for likelihood ratio detection of
Gaussian signals with noise varying from element to element of the receiving
array [39]. Furthermore, if the ambient noise power is identical at all
hydrophones, a conventional beamformer is obtained. Thus the cost of implementing
optimum filters depends largely on the degree of noise correlation between
various hydrophones.

b)  Single  Interference 

If there is only a single interference and the ambient noise is statistically
independent from hydrophone to hydrophone, the optimum filters are

$$\mathbf{H}_{op} = [\, \phi_o\, \mathbf{I} + \phi_1\, \mathbf{b}^* \mathbf{b}^T \,]^{-1}\, \phi_d\, \mathbf{a}^*$$  (2.5-9)

and the ith row is just

$$H_i = \frac{\phi_d}{\phi_o} \left\{ e^{-j\omega\tau_i} - \frac{e^{-j\omega\rho_i}}{K + \phi_o/\phi_1} \sum_{m=1}^{K} e^{j\omega(\rho_m - \tau_m)} \right\}$$  (2.5-10)

If the input spectra are identical in shape, then

$$H_i = \frac{S}{N} \left\{ e^{-j\omega\tau_i} - \frac{e^{-j\omega\rho_i}}{K + N/I} \sum_{m=1}^{K} e^{j\omega(\rho_m - \tau_m)} \right\}$$  (2.5-11)

where S, N, and I are respectively the power levels of target, ambient noise,
and interference. The filter defined by Eq. (2.5-11) can be constructed by
setting the gains according to

$$c_{ik} = \frac{S}{N} \left( \delta_{ik} - \frac{1}{K + N/I} \right)$$  (2.5-12)

at taps corresponding to time delays of

$$\rho_i - \rho_k + \tau_k$$  (2.5-13)

for i = 1, 2, ..., K and k = 1, 2, ..., K. ($\delta_{ik}$ is the Kronecker $\delta$.) Here
the number of taps on each individual filter is equal to the number of
hydrophones in the array (M = K). The effect of interference appears in the
summation terms of Eq. (2.5-11). The impulse response of the ith filter under this
situation will consist of two parts. There is a positive spike of strength
$\frac{S}{N}\left(1 - \frac{1}{K + N/I}\right)$ at $t = \tau_i$ and negative impulses of strength
$\frac{S}{N}\,\frac{1}{K + N/I}$ at $t = \rho_i - \rho_k + \tau_k$.
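
The closed form of Eq. (2.5-11) follows from inverting a rank-one update of the identity matrix, and it can be checked against a direct numerical solution of Eq. (2.5-9). The sketch below uses hypothetical delays, power levels, and frequency chosen only for illustration; the helper names are not from the report.

```python
import cmath

def solve(A, b):
    """Gaussian elimination with partial pivoting (complex entries allowed)."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0j] * n
    for r in reversed(range(n)):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def direct_filters(tau, rho, omega, S, N, I):
    """Solve [N*Identity + I*conj(b) b^T] H = S*conj(a), as in Eq. (2.5-9)."""
    K = len(tau)
    a_conj = [cmath.exp(-1j * omega * t) for t in tau]
    b = [cmath.exp(1j * omega * p) for p in rho]
    Phi = [[(N if i == h else 0.0) + I * b[i].conjugate() * b[h]
            for h in range(K)] for i in range(K)]
    return solve(Phi, [S * v for v in a_conj])

def closed_form_filters(tau, rho, omega, S, N, I):
    """Evaluate Eq. (2.5-11) element by element."""
    K = len(tau)
    coupling = sum(cmath.exp(1j * omega * (p - t)) for p, t in zip(rho, tau))
    return [(S / N) * (cmath.exp(-1j * omega * t)
                       - cmath.exp(-1j * omega * p) * coupling / (K + N / I))
            for t, p in zip(tau, rho)]

# Assumed example values: four hydrophones, delays in seconds.
tau = [0.0, 1.0e-4, 2.0e-4, 3.0e-4]      # target steering delays
rho = [0.0, -2.5e-4, -5.0e-4, -7.5e-4]   # interference steering delays
omega, S, N, I = 2000.0, 1.0, 2.0, 10.0

direct = direct_filters(tau, rho, omega, S, N, I)
closed = closed_form_filters(tau, rho, omega, S, N, I)
max_diff = max(abs(u - v) for u, v in zip(direct, closed))
print(max_diff)
```

The two results agree to rounding error, which supports the tap gains of Eq. (2.5-12) for this single-interference case.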


c)  Two  or  More  Interferences 

In the presence of L interferences, the inverse of a general noise
spectral matrix has been investigated by Tuteur [33]. The results are

T 

|  ¥*1  -r 
,-l 


£  -  s.  -  C  ^  ‘[m  +  £1' 


4> 


-1 


(2.5-14) 


where  G  has  elements 

T  -1  * 


8 


ih 


,V*h  V  4 


(2.5-15) 


For  independent  ambient  noise; 

_I  ,  1^  «  unity  matrix 

Eq.  (2.5-14)  reduces  to 


r1- 

mn 


C-  S  ■^‘[hi  +  g]-1 


where 


g 


ih 


^i  *h  T  * 
- \ 


(2.5-16) 


(2.5-17) 


If  two  interferences  are  present,  L  =  2  ,  the  inverse  of  the  corresponding  noise 
spectral  matrix  becomes 


"  ♦o1  1  "  D~"  [(<J>o  +  h  KH1  ^1  ^1 

+  (4>q  +  *2  K)^2  b *  b j  -  2  4>2  (b*  b^)2] 


(2.5-18) 


where 


m  A 

(*0  +  <l,1  K)(4>0  +  ^2  ~  ^1  ^2  1—1  —2 1  2 


(2.5-19) 




For  similar  input  spectra,  the  optimal  individual  filters  are 


* 

H  -  4-  <|> ,  a 

-op  — nn  d  — 


fa*  -  Sioi  +  n^bJbJ.* 

+  (N  +  KI2)X2  b*  b ^  a*  -  2  I^fb*  b*)2  a*] 


(2.5-20) 


where $I_k$ for k = 1, 2 is the power of the kth interference.

The ith individual filter is therefore

S  "j“Ti  SN  -jwp  (1)  K  jw(Pm(1)  -  t  ) 

ll  -  I  *  -  r  !<“  +  ^I11!  •  .Si  •  * 

<2)  k  3u(p*2>  -  i  > 

+  (N  +  KI  )I  e  1  £  e  m 

2  2  m=»l 


-jup  (1)  K  K  J(0<p  <2)  -  p  (1)  +  p  (2)  -  t  )] 
“2IlI2fc  m£ln£le 


<2.5-21) 


In Eq. (2.5-21) the first two terms inside the bracket can be realized using K
taps, but the last term would require $K^2$ taps to produce the desired impulse
response. It is readily seen from Eq. (2.5-16) that in the presence of L
interfering sources $K^L$ taps would have to be used. Since the number of
hydrophones in an array may be large (on the order of $10^2$ or more), the cost of
implementing optimal filters for several interferences could become extremely
large.

A different point of view has been taken by Tuteur [45], who has considered the
number of taps on the delay line required to realize the angular resolution of
which the array is capable. For the particular example of a linear array he has
found that the tap spacing needed to match the angular resolution is on the order
of 1/BK, where B is the signal bandwidth in hertz. Since the maximum delay
required to steer the array over 180° in azimuth is 2(K-1)d/c, where d is the
hydrophone spacing and c the velocity of sound in water, the number of taps
needed to provide all the possible signal delays required by the $\rho$'s and $\tau$'s in
Eq. (2.5-21) is

$$M = 2BK(K-1)d/c$$  (2.5-22)

Although for typical bandwidths and large array sizes this number is very large,
it is independent of the number of interferences. This is true since the
argument based on angular resolution implies that many of the taps required to
implement Eq. (2.5-16) can, in fact, be considered identical. Note that M as
given by Eq. (2.5-22) is on the order of K times as large as the value given in
Eq. (2.4-19). The very much larger estimate obtained here is a direct result of
requiring the array to be able to resolve several sources at different angles
simultaneously. If delay lines with the smaller number of taps given by Eq.
(2.4-19) were used one would expect a performance degradation resulting from the
fact that the array could in general not be precisely steered in the various
interference directions. The extent of this degradation has not been investigated.
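
As a rough numerical illustration of Eq. (2.5-22), the sketch below divides the maximum steering delay by the tap spacing needed for full angular resolution. The bandwidth, array size, spacing, and sound speed are assumed values chosen only to show the order of magnitude; the function name is hypothetical.

```python
def taps_for_resolution(bandwidth_hz, num_hydrophones, spacing_m, sound_speed):
    """Tap count of Eq. (2.5-22): maximum steering delay over tap spacing."""
    tap_spacing = 1.0 / (bandwidth_hz * num_hydrophones)                  # order of 1/(B K)
    max_delay = 2.0 * (num_hydrophones - 1) * spacing_m / sound_speed     # 2 (K - 1) d / c
    return max_delay / tap_spacing

# Assumed example: B = 1 kHz, K = 20, d = 0.75 m, c = 1500 m/s.
M_resolution = taps_for_resolution(1000.0, 20, 0.75, 1500.0)
print(M_resolution)
```

For these assumed values the result is about 380 taps per hydrophone, and the count grows roughly as the square of the number of hydrophones.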




CHAPTER  THREE 


THE  ADAPTIVE  MECHANISM 

3.1  Introduction 

The  previous  chapter  presented  a  means  for  determining  the  optimum  values 
of  the  gains  provided  that  the  statistical  properties  of  both  the  desired  signal 
and  the  noise  are  known. 

In the present chapter a method is developed for adjusting the gains
automatically when this information is only partially available. In particular, it
will be shown that adjustment is possible if only the correlation function of
the desired signal, or (not and) of the noise is available.

The adaptive filter described here bases its own design (its internal
adjustable gains) upon estimated or measured statistical characteristics of input
and output signals. The statistics are not measured explicitly and then used to
design the filter; rather the design is accomplished in a single process by a
recursive algorithm which updates the adjustments with the arrival of each new
data sample.

Two  of  the  most  commonly  used  iterative  methods  for  making  adjustments  to 
improve  system  performance  are  the  relaxation  method  and  the  method  of  steepest 
descent  (or  ascent) .  The  relaxation  method  involves  making  a  change  in  the  value 
of  only  one  of  the  controller  parameters  and  then  re-evaluating  the  performance 
measure.  If  the  performance  has  been  improved,  a  second  change  in  the  same 
direction  is  made;  otherwise,  the  first  change  is  retracted  and  a  change  in  the 
opposite  direction  is  made.  This  process  is  continued  until  no  further  improve¬ 
ment  in  the  performance  measure  can  be  accomplished  by  adjusting  that  particular 
parameter;  whereupon  the  same  process  is  repeated  for  each  of  the  remaining 
controller  parameters.  After  several  iterations  through  the  entire  procedure, 
the  controller  parameters  tend  toward  that  set  of  values  which  yields  the  optimum 
performance  measure. 
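
The relaxation procedure described above can be sketched as follows. The quadratic performance measure, the fixed step size, and the number of sweeps are assumptions made for this illustration; the procedure is stated here for minimization of a cost.

```python
def cost(params):
    """Hypothetical performance measure (to be minimized): a coupled quadratic."""
    p0, p1 = params
    return (p0 - 1.0) ** 2 + (p1 + 2.0) ** 2 + p0 * p1

def relaxation_minimize(cost_fn, params, step=0.1, sweeps=50):
    """Adjust one parameter at a time, keeping a change only if it helps."""
    params = list(params)
    for _ in range(sweeps):
        for i in range(len(params)):
            best = cost_fn(params)
            for direction in (1.0, -1.0):
                improved = False
                while True:
                    trial = list(params)
                    trial[i] += direction * step
                    trial_cost = cost_fn(trial)
                    if trial_cost < best:       # keep the change and try again
                        params, best, improved = trial, trial_cost, True
                    else:                       # retract the change
                        break
                if improved:
                    break                       # no need to try the opposite direction
    return params

start = [0.0, 0.0]
result = relaxation_minimize(cost, start)
print(result, cost(result))
```

With a fixed step the parameters settle within about one step of the true minimum, which for this example lies at (8/3, -10/3).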




The methods of steepest descent (or ascent), referred to as gradient techniques,
operate in a manner similar to the relaxation method, with the notable
exception that all parameters are adjusted simultaneously rather than sequentially.

This is done by measuring the partial derivative of the performance measure with
respect to each of the controller parameters and then adjusting all the
parameters in such a way that the net effect is the largest possible improvement in
the performance measure. A number of techniques have been developed for
determining the partial derivatives.

The  most  straightforward  method  is  to  perturb  each  of  the  parameters  sequen¬ 
tially  and  measure  the  derivatives  directly.  This  procedure,  however,  offers 
little  advantage  over  the  relaxation  method.  A  second  technique  is  to  perturb  the 
parameters  simultaneously  in  such  a  manner  that  the  effect  of  the  perturbation  of 
each  parameter  on  the  performance  measure  will  be  distinguishable  from  the  effects 
of  the  perturbations  of  all  the  other  parameters.  Ways  in  which  this  may  be  done 
include  perturbation  by  independent  random  noise,  distinguishing  the  individual 
effects  by  correlation  detection  [14];  or  perturbation  by  frequency-separated 
sinusoids,  distinguishing  the  effects  by  narrow-band  detection  [40] . 
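The idea of perturbing all parameters at once with independent random noise and separating the individual effects by correlation can be sketched as follows. This is an illustrative Python fragment, not the procedure of [14]: the function name `correlation_gradient`, the use of random signs of fixed amplitude, the number of trials, and the test function `bowl` are all assumptions of this sketch.

```python
import random


def correlation_gradient(performance, params, amplitude=0.01,
                         trials=20000, seed=0):
    """Estimate the gradient by correlating the change in the performance
    measure with independent random perturbations of every parameter."""
    rng = random.Random(seed)
    n = len(params)
    base = performance(params)
    sums = [0.0] * n
    for _ in range(trials):
        # Independent +/- amplitude perturbation on each parameter.
        delta = [amplitude * rng.choice((-1.0, 1.0)) for _ in range(n)]
        change = performance([p + d for p, d in zip(params, delta)]) - base
        for i in range(n):
            sums[i] += change * delta[i]     # correlation detection
    # Dividing by the perturbation power turns the correlation into a slope.
    return [s / (trials * amplitude ** 2) for s in sums]


def bowl(c):
    # Hypothetical performance measure; its gradient at (0, 0) is (-2, 4).
    return (c[0] - 1.0) ** 2 + (c[1] + 2.0) ** 2


if __name__ == "__main__":
    print(correlation_gradient(bowl, [0.0, 0.0]))
```

The estimate is noisy because the perturbations of the other parameters act as interference; averaging over many trials is what plays the role of the correlation detector.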

Gradient techniques can be considered a special case of the more general
method of stochastic approximation, by which either deterministic or random
problems can be solved with ease. In this chapter adaptive algorithms will be
derived to automatically adjust the weights on the tapped-delay lines described
in the previous chapter. The methods of stochastic approximation will be used
extensively in the remaining part of this research.

3.2  Methods  of  Stochastic  Approximation 

The methods of stochastic approximation were originally developed by Robbins
and Monro in 1951 [28]. Their purpose was to find the root of a function disturbed
by measurement noise. The term "stochastic" refers to the random character
of the experimental errors, while the term "approximation" refers to the continued
use of past measurements to estimate the approximate position of the goal.
Kiefer and Wolfowitz [29] adapted the idea of stochastic approximation to the
problem of finding the maximum of a unimodal function obscured by noise. Blum
[30] used the gradient method to extend the above techniques to the multidimensional
case. Later on Dvoretzky [31] greatly generalized and unified the
whole theory, and Kesten [41] derived some formulas to speed up the rate of convergence
in terms of the number of changes in sign before a certain step.

a)  Basic  Considerations 

Stochastic approximation, much like ordinary successive approximation in the
absence of experimental error, involves two basic considerations: first choosing
a promising direction in which to search, and then selecting the distance to travel in
that direction. Picking a search direction is no more difficult for stochastic
than for deterministic approximations, for one simply behaves as if he believed
the experimental results, ignoring entirely the possibility of error. This means
of course that the experimenter will move away from his goal whenever he is misled
by the vagaries of chance error. It will be seen that such temporary setbacks
do not prevent ultimate convergence if the step sizes are chosen properly.

In both stochastic and deterministic schemes, the corrections are made
progressively smaller as the search proceeds so that the process will eventually
converge. To make this convergence rapid, one would like to shrink the step size
as speedily as possible. The main difference between stochastic and deterministic
procedures is in fact the speed with which the steps can be shortened. When
noise is totally absent one can reduce the steps very rapidly, but when there is
danger of an occasional jump in the wrong direction, shortening the steps too
rapidly could make it impossible to erase the long-run effects of a mistake. In
the latter case the process would still converge, but to the wrong value.

b)  The  Ordinary  Methods 

Many problems in modern engineering systems design can be reduced to that of
finding the extrema of functions of several variables

$$ I = Q(c_1, c_2, \ldots, c_n) = Q(\mathbf{c}) \tag{3.2-1} $$

where $\mathbf{c} = \{c_1, c_2, \ldots, c_n\}$.

Denoting the optimal values of $\mathbf{c}$ by $\mathbf{c}_{op}$ and assuming that the extremum of interest
to us is a minimum, we can obtain the solution $\mathbf{c} = \mathbf{c}_{op}$ by setting the gradient
of $Q(\mathbf{c})$ equal to zero, i.e.,

$$ \nabla_c Q(\mathbf{c}) = 0 \tag{3.2-2} $$

where

$$ \nabla_c Q(\mathbf{c}) = \left\{ \frac{\partial Q(\mathbf{c})}{\partial c_1}, \ldots, \frac{\partial Q(\mathbf{c})}{\partial c_n} \right\} $$

Generally a closed-form solution cannot be obtained for (3.2-2), so iteration
methods are required, especially the gradient method. The gradient method relates
the coordinates of a given point with the coordinates of the preceding
point and the gradient $\nabla_c Q(\mathbf{c})$. The algorithm for determining $\mathbf{c}_{op}$ can be
written in the form

$$ \mathbf{c}_{j+1} = \mathbf{c}_j - \gamma_j \nabla_c Q(\mathbf{c}_j) \tag{3.2-3} $$


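When $Q(\mathbf{c})$ is not given analytically or is not differentiable, the gradient
$\nabla_c Q$ can be approximately determined with the formula

$$ \nabla_c Q(\mathbf{c}) \approx \frac{Q_+(\mathbf{c}, a) - Q_-(\mathbf{c}, a)}{2a} \tag{3.2-4} $$

where

$$ Q_\pm(\mathbf{c}, a) = \{ Q(\mathbf{c} \pm a\mathbf{e}_1), \ldots, Q(\mathbf{c} \pm a\mathbf{e}_n) \} \tag{3.2-5} $$

and $\mathbf{e}_i$ denotes the base vectors

$$ \mathbf{e}_1 = (1, 0, \ldots, 0), \ \ldots, \ \mathbf{e}_n = (0, 0, \ldots, 1) \tag{3.2-6} $$

The corresponding algorithm is then

$$ \mathbf{c}_{j+1} = \mathbf{c}_j - \gamma_j \frac{Q_+(\mathbf{c}_j, a_j) - Q_-(\mathbf{c}_j, a_j)}{2a_j} \tag{3.2-7} $$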
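As a concrete reading of the difference formula and the search algorithm built on it, the following Python sketch evaluates the two-sided differences along each base vector and then applies the gradient step. The names `difference_gradient`, `gradient_search` and `bowl`, the constant step size, and the constant difference width are assumptions made for this illustration only.

```python
def difference_gradient(q, c, a):
    """Two-sided difference [Q(c + a e_i) - Q(c - a e_i)] / (2a) for each i."""
    grad = []
    for i in range(len(c)):
        plus = list(c)
        minus = list(c)
        plus[i] += a                 # c + a e_i
        minus[i] -= a                # c - a e_i
        grad.append((q(plus) - q(minus)) / (2.0 * a))
    return grad


def gradient_search(q, c0, gamma=0.1, a=1e-3, steps=200):
    """Repeat c <- c - gamma * (difference gradient), with constant gamma and a."""
    c = list(c0)
    for _ in range(steps):
        g = difference_gradient(q, c, a)
        c = [ci - gamma * gi for ci, gi in zip(c, g)]
    return c


def bowl(c):
    # Hypothetical deterministic function with its minimum at (1, -2).
    return (c[0] - 1.0) ** 2 + (c[1] + 2.0) ** 2


if __name__ == "__main__":
    print(gradient_search(bowl, [0.0, 0.0]))
```

For a noise-free function a constant step is adequate; the decreasing sequences discussed later in the section matter once the measurements of $Q$ are random.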


In the above we assumed that $Q(\mathbf{c})$ is a deterministic function. If we
consider a function $Q(\mathbf{x}|\mathbf{c})$, where $\mathbf{x} = \{x_1, \ldots, x_n\}$ is a vector of stationary
random variables with distribution $P(\mathbf{x})$, it is natural to attempt to
find the extrema of the mathematical expectation:

$$ I(\mathbf{c}) = \int Q(\mathbf{x}|\mathbf{c})\, P(\mathbf{x})\, d\mathbf{x} = E\{Q(\mathbf{x}|\mathbf{c})\} \tag{3.2-8} $$

The condition for determining the optimal value $\mathbf{c}_{op}$ is of the form

$$ \nabla I(\mathbf{c}) = E\{\nabla_c Q(\mathbf{x}|\mathbf{c})\} = 0 \tag{3.2-9} $$


We can apply the algorithms (3.2-3) and (3.2-7) to (3.2-9) and the functional
(3.2-8) only when the a priori distribution $P(\mathbf{x})$ is known and, consequently, the
mathematical expectation can be determined beforehand. Frequently, however, the
probability density function $P(\mathbf{x})$ is unknown. Nonetheless, the optimal vector
$\mathbf{c}_{op}$ can still be determined by applying the gradient method using $\nabla_c Q(\mathbf{x}|\mathbf{c})$
instead of $\nabla I(\mathbf{c})$. This is one of the advantages of using the method of
stochastic approximation. With this method the algorithms for determining $\mathbf{c}_{op}$
can be written in the form

$$ \mathbf{c}_{j+1} = \mathbf{c}_j - \gamma_j \nabla_c Q(\mathbf{x}_j|\mathbf{c}_j) \tag{3.2-10} $$

if $Q(\mathbf{x}|\mathbf{c})$ is analytic and differentiable, and

$$ \mathbf{c}_{j+1} = \mathbf{c}_j - \frac{\gamma_j}{2a_j} \left[ Q_+(\mathbf{x}_j|\mathbf{c}_j, a_j) - Q_-(\mathbf{x}_j|\mathbf{c}_j, a_j) \right] \tag{3.2-11} $$

if $\nabla_c Q(\mathbf{x}|\mathbf{c})$ does not exist. Here $\gamma_j$ determines the pitch of the algorithm
and generally depends on the index of the step and the function itself.

Algorithm (3.2-10) is a multivariate form of the Robbins-Monro procedure,
while algorithm (3.2-11) is a multivariate form of the Kiefer-Wolfowitz scheme.
The analogy between deterministic and stochastic algorithms is apparent. It
should be emphasized, however, that stochastic algorithms deal with stationary
random variables which may contain random noise in addition to the useful signal.
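A scalar illustration of the two stochastic algorithms may help fix ideas. The Python sketch below is an assumption-laden toy, not the report's implementation: it takes $Q(x|c) = (x - c)^2$, the gain $\gamma_j = 1/(2j)$, and a test-step sequence $a_j = j^{-1/4}$, and the function names are hypothetical. With this particular $Q$ and gain, the Robbins-Monro iterate is exactly the running mean of the samples, and the Kiefer-Wolfowitz difference reproduces the same gradient because $Q$ is quadratic.

```python
import random


def robbins_monro(samples, c0=0.0):
    """Gradient form of the algorithm for Q(x|c) = (x - c)^2, gamma_j = 1/(2j)."""
    c = c0
    for j, x in enumerate(samples, start=1):
        gradient = -2.0 * (x - c)            # dQ/dc evaluated at the sample
        c -= gradient / (2.0 * j)            # c <- c - gamma_j * gradient
    return c


def kiefer_wolfowitz(samples, c0=0.0):
    """Difference form of the algorithm: no derivative of Q is used."""
    def q(x, c):
        return (x - c) ** 2

    c = c0
    for j, x in enumerate(samples, start=1):
        a = j ** -0.25                       # assumed test-step sequence a_j
        difference = q(x, c + a) - q(x, c - a)
        c -= difference / (2.0 * a) / (2.0 * j)
    return c


def make_samples(count=5000, mean=3.0, seed=1):
    # Hypothetical noisy observations scattered around the optimum value.
    rng = random.Random(seed)
    return [mean + rng.gauss(0.0, 1.0) for _ in range(count)]


if __name__ == "__main__":
    data = make_samples()
    print(robbins_monro(data), kiefer_wolfowitz(data))
```

Neither routine needs the distribution of the samples; each uses only the current observation, which is the point made in the text.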




c) Convergence Conditions and Their Geometrical Significance

We shall describe the conditions under which the above-mentioned algorithms
converge. Since the mean squared error is used throughout this study as the only
performance criterion, $Q(\mathbf{x}|\mathbf{c})$ is analytic and differentiable and we therefore
need to consider only algorithm (3.2-10).

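Let $\mathbf{c}_{op}$ satisfy the equation

$$ E\{\nabla_c Q(\mathbf{x}|\mathbf{c})\} = 0 \tag{3.2-12} $$

$E\{\nabla_c Q(\mathbf{x}|\mathbf{c})\}$ is a set of real measurable functions of real variables $\mathbf{c}$ such that

$$ E\{\nabla_c Q(\mathbf{x}|\mathbf{c})\} \gtreqless 0 \quad \text{for} \quad \mathbf{c} \gtreqless \mathbf{c}_{op} \tag{3.2-13} $$

where $\mathbf{c}_{op}$ is a constant vector, and where $\mathbf{c} < \mathbf{c}_{op}$ means $c_i < c_{op,i}$
for all $i$.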


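Theorem 1: Let $\gamma_1, \gamma_2, \ldots$ be a sequence of positive numbers such that

$$ \text{(A1)} \quad \lim_{j \to \infty} \gamma_j = 0 $$

$$ \text{(A2)} \quad \sum_{j=1}^{\infty} \gamma_j = \infty \tag{3.2-13} $$

$$ \text{(A3)} \quad \sum_{j=1}^{\infty} \gamma_j^2 < \infty $$

Let the following conditions be satisfied

$$ \text{(B)} \quad \inf_{\epsilon < \|\mathbf{c} - \mathbf{c}_{op}\| < 1/\epsilon} (\mathbf{c} - \mathbf{c}_{op})^T E\{\nabla_c Q(\mathbf{x}|\mathbf{c})\} > 0, \quad \epsilon > 0 \tag{3.2-14} $$

$$ \text{(C)} \quad E\{\nabla_c^T Q(\mathbf{x}|\mathbf{c})\, \nabla_c Q(\mathbf{x}|\mathbf{c})\} < d\,(\mathbf{c}_{op}^T \mathbf{c}_{op} + \mathbf{c}^T \mathbf{c}) \tag{3.2-15} $$

for all $\mathbf{c}$ in a bounded set and $d > 0$.

Then the sequence $\mathbf{c}_j$ defined by (3.2-10) converges with probability one to
$\mathbf{c}_{op}$.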


Proof:  The  above  theorem  has  been  proved  by  many  authors.  An  outline  is  given 
in  Appendix  B  . 

We see that there are several restrictions imposed on the sequence $\{\gamma_j\}$
as well as on the behavior of the function $\nabla_c Q(\mathbf{x}|\mathbf{c})$. These conditions not
only guarantee the convergence of the algorithms but also possess certain geometrical
meanings.

(1) $\gamma_j > 0$. This is to assure that the corrections, on the average, are
made in the right directions.

(2) $\gamma_j \to 0$ as $j \to \infty$. This is to assure that $\mathbf{c}_j$ calculated from algorithm
(3.2-10) will converge on some specific value. Suppose we let the measured
error gradient be $\nabla_c Q(\mathbf{x}|\mathbf{c})$ and the averaged gradient be $E\{\nabla_c Q(\mathbf{x}|\mathbf{c})\}$.
Then

$$ \nabla_c Q(\mathbf{x}|\mathbf{c}_j) = E\{\nabla_c Q(\mathbf{x}|\mathbf{c}_j)\} + \boldsymbol{\zeta}_j, \quad j = 1, 2, \ldots \tag{3.2-16} $$

where $\boldsymbol{\zeta}_j$ is a zero-mean random variable.

Thus, $\nabla_c Q(\mathbf{x}|\mathbf{c}_j)$ is not necessarily zero even if $\mathbf{c}_j = \mathbf{c}_{op}$. If the condition
$\gamma_j \to 0$ as $j \to \infty$ is satisfied, the random fluctuations are reduced to
zero as $j \to \infty$, which permits $\mathbf{c}_j$ to converge.

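(3) $\sum_{j=1}^{\infty} \gamma_j^2 < \infty$, or $\sum_{j=J}^{\infty} \gamma_j^2 \to 0$ as $J \to \infty$.

This condition is needed to account for the cumulative effect of the fluctuation
$\boldsymbol{\zeta}_j$. If Eq. (3.2-16) is substituted in Eq. (3.2-10), there results

$$ \mathbf{c}_{j+1} - \mathbf{c}_j = -\gamma_j E\{\nabla_c Q(\mathbf{x}|\mathbf{c}_j)\} - \gamma_j \boldsymbol{\zeta}_j \tag{3.2-17} $$

Summing the above from $j = J$ upward gives

$$ \mathbf{c}_\infty - \mathbf{c}_J = -\sum_{j=J}^{\infty} \gamma_j E\{\nabla_c Q(\mathbf{x}|\mathbf{c}_j)\} - \sum_{j=J}^{\infty} \gamma_j \boldsymbol{\zeta}_j \tag{3.2-18} $$

which expresses the total variation in $\mathbf{c}$ from the $J$th step onward. The
variance of the random part of this variation is

$$ E\left\{ \Big\| \sum_{j=J}^{\infty} \gamma_j \boldsymbol{\zeta}_j \Big\|^2 \right\} \tag{3.2-19} $$

It is assumed that observations on $Q(\mathbf{x}|\mathbf{c})$ are taken sufficiently far apart in
time so that the $\boldsymbol{\zeta}_j$ are independent.

Hence the right-hand side of (3.2-19) becomes $\sum_{j=J}^{\infty} \gamma_j^2 E\{\zeta_j^2\}$.

Assuming that $E[\zeta_j^2] \le E[\zeta_J^2]$ for all $j \ge J$,

$$ E\left\{ \Big( \sum_{j=J}^{\infty} \gamma_j \zeta_j \Big)^2 \right\} \le E\{\zeta_J^2\} \sum_{j=J}^{\infty} \gamma_j^2 \tag{3.2-20} $$

Hence the requirement $\sum_{j=J}^{\infty} \gamma_j^2 \to 0$ assures that the variance and the total random
variation approach zero as $J \to \infty$.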


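(4) $\sum_{j=1}^{\infty} \gamma_j = \infty$.

Conditions (1) through (3) assure only that $\mathbf{c}_j$ converges to some value
$\mathbf{c}_\infty$. Condition (4) assures that $\mathbf{c}_\infty = \mathbf{c}_{op}$. This follows from Eq. (3.2-18).
Taking expectations on both sides yields

$$ E[\mathbf{c}_\infty - \mathbf{c}_J] = -\sum_{j=J}^{\infty} \gamma_j E\{\nabla_c Q(\mathbf{x}|\mathbf{c}_j)\} \tag{3.2-21} $$

Then, since condition (4) implies $\sum_{j=J}^{\infty} \gamma_j = \infty$, if $\mathbf{c}_j$ approaches any value other
than $\mathbf{c}_{op}$, $E\{\nabla_c Q(\mathbf{x}|\mathbf{c}_j)\}$ will not be zero for any $j > J$ and therefore the total
corrective effort $\sum_{j=J}^{\infty} \gamma_j E\{\nabla_c Q(\mathbf{x}|\mathbf{c}_j)\}$ becomes infinite.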

J=J  J  c  J 

The above four conditions state that the rate with which $\gamma_j$ decreases must
be such that, on the one hand, the variance of the performance index vanishes, and on
the other hand, the variation in $\gamma_j$ over the variation period is large enough
for the law of large numbers to hold.
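The role of the divergence condition on the sum of the gains can be seen in a noise-free toy case. The Python sketch below is an assumption for illustration (the function name `final_error` and both gain sequences are hypothetical): it applies the gradient rule to $Q(c) = c^2/2$, whose optimum is $c = 0$, once with gains $1/(j+1)$, whose sum diverges, and once with gains $2^{-j}$, whose sum is finite.

```python
def final_error(gain, start=1.0, steps=2000):
    """Iterate c <- c - gamma_j * c, the noise-free gradient rule for Q = c^2/2."""
    c = start
    for j in range(1, steps + 1):
        c -= gain(j) * c
    return c


def harmonic_gain(j):
    return 1.0 / (j + 1)        # gains tend to zero, but their sum diverges


def geometric_gain(j):
    return 0.5 ** j             # gains shrink so fast that their sum is finite


if __name__ == "__main__":
    print(final_error(harmonic_gain), final_error(geometric_gain))
```

With the harmonic gains the error after $J$ steps is $1/(J+1)$ and keeps shrinking; with the geometric gains the iterate settles near 0.29 of its starting value, that is, it converges, but to the wrong value, because the total corrective effort is finite.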




One would conclude therefore that the same $\gamma_j$ which eliminates the error
caused by the difference between $(\nabla Q_1 + \nabla Q_2)$ and $E\{\nabla Q_1 + \nabla Q_2\}$ would also
eliminate that caused by the difference between $(\nabla Q_1 + E\{\nabla Q_2\})$ and $E\{\nabla Q_1 + \nabla Q_2\}$.

This is stated precisely in Theorem 2.

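Theorem 2: Let $\gamma_1, \gamma_2, \ldots$ be a sequence of positive numbers such that

$$ \text{(A)} \quad \lim_{j \to \infty} \gamma_j = 0, \quad \sum_{j=1}^{\infty} \gamma_j = \infty, \quad \sum_{j=1}^{\infty} \gamma_j^2 < \infty \tag{3.2-25} $$

Let the following conditions be satisfied

$$ \text{(B)} \quad \inf_{\epsilon < \|\mathbf{c} - \mathbf{c}_{op}\| < 1/\epsilon} (\mathbf{c} - \mathbf{c}_{op})^T E\{\nabla_c Q_1 + \nabla_c Q_2\} > 0 \tag{3.2-26} $$

$$ \text{(C)} \quad E\{(\nabla_c Q_1 + \nabla_c Q_2)^T (\nabla_c Q_1 + \nabla_c Q_2)\} < d\,(\mathbf{c}_{op}^T \mathbf{c}_{op} + \mathbf{c}^T \mathbf{c}) \tag{3.2-27} $$

where $\epsilon > 0$, $d > 0$.
Then the algorithm

$$ \mathbf{c}_{j+1} = \mathbf{c}_j - \gamma_j (\nabla Q_1 + \nabla Q_2) \tag{3.2-28} $$

minimizing

$$ E\{Q_1(\mathbf{c}) + Q_2(\mathbf{c})\} $$

converges with probability one to $\mathbf{c}_{op}$.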

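Proof:

Subtracting $\mathbf{c}_{op}$ from both sides of Eq. (3.2-28)

$$ \mathbf{c}_{j+1} - \mathbf{c}_{op} = \mathbf{c}_j - \mathbf{c}_{op} - \gamma_j (\nabla Q_1 + \nabla Q_2) \tag{3.2-29} $$

and taking the inner product, we have

$$ (\mathbf{c}_{j+1} - \mathbf{c}_{op})^T (\mathbf{c}_{j+1} - \mathbf{c}_{op}) = (\mathbf{c}_j - \mathbf{c}_{op})^T (\mathbf{c}_j - \mathbf{c}_{op}) - 2\gamma_j (\mathbf{c}_j - \mathbf{c}_{op})^T (\nabla Q_1 + \nabla Q_2) + \gamma_j^2 (\nabla Q_1 + \nabla Q_2)^T (\nabla Q_1 + \nabla Q_2) \tag{3.2-30} $$

Taking the conditional mathematical expectation for given $\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_j$ yields

$$ E\{\|\mathbf{c}_{j+1} - \mathbf{c}_{op}\|^2 \mid \mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_j\} = \|\mathbf{c}_j - \mathbf{c}_{op}\|^2 - 2\gamma_j (\mathbf{c}_j - \mathbf{c}_{op})^T E\{\nabla Q_1 + \nabla Q_2\} + \gamma_j^2 E\{(\nabla Q_1 + \nabla Q_2)^T (\nabla Q_1 + \nabla Q_2)\} \tag{3.2-31} $$

From condition (C), Eq. (3.2-31) becomes

$$ E\{\|\mathbf{c}_{j+1} - \mathbf{c}_{op}\|^2 \mid \mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_j\} \le \|\mathbf{c}_j - \mathbf{c}_{op}\|^2 - 2\gamma_j (\mathbf{c}_j - \mathbf{c}_{op})^T E\{\nabla Q_1 + \nabla Q_2\} + \gamma_j^2 d\,(\mathbf{c}_{op}^T \mathbf{c}_{op} + \mathbf{c}_j^T \mathbf{c}_j) \tag{3.2-32} $$

Using condition (B), the above reduces to

$$ E\{\|\mathbf{c}_{j+1} - \mathbf{c}_{op}\|^2 \mid \mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_j\} \le \|\mathbf{c}_j - \mathbf{c}_{op}\|^2 + \gamma_j^2 d\,(\mathbf{c}_{op}^T \mathbf{c}_{op} + \mathbf{c}_j^T \mathbf{c}_j) \tag{3.2-33} $$

From this point on, we can follow in exactly the same manner the steps leading
from (B-9) to (B-18) in Appendix B.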


3.3 The Design of Adaptive Tapped-Delay-Line Filters

The adaptive algorithms used here are derived from the methods of stochastic
approximation stated in the previous sections. The quality criterion may be
represented in the form of the mathematical expectation of some strictly convex
function of the deviation of the output variation from the desired function. For
simplicity we shall use the mean-squared criterion. Thus

$$ I(\mathbf{c}) = E\{Q(d - z)\} \quad \text{with} \quad Q(e) = e^2 \tag{3.3-1} $$

For the tapped-delay-line filter shown schematically in Fig. 4 we know

$$ x_i(t) = s_i(t) + n_i(t) \tag{3.3-2} $$


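It is assumed for the moment that these functions are stationary random processes.
Nonstationary or time-varying systems will be considered in a later section.

As seen from Fig. 4 and the definition of $d(t)$ given in Sect. 2.1 the error
function is

$$ e(t) = d(t) - z(t) = d - \mathbf{W}^T \boldsymbol{\eta} \tag{3.3-3} $$

and its square is

$$ Q(e) = e^2 = d^2 - 2 d\, \boldsymbol{\eta}^T \mathbf{W} + \mathbf{W}^T \boldsymbol{\eta} \boldsymbol{\eta}^T \mathbf{W} \tag{3.3-4} $$

The gradient of $Q$ with respect to the weights becomes

$$ \nabla_W Q(e) = -2 d\, \boldsymbol{\eta} + 2 \boldsymbol{\eta} \boldsymbol{\eta}^T \mathbf{W} = -2 e\, \boldsymbol{\eta} \tag{3.3-5} $$

Upon using algorithm (3.2-10), the adaptive scheme to adjust the weights is obtained
as

$$ \mathbf{W}_{j+1} = \mathbf{W}_j + 2 \gamma_j \boldsymbol{\eta}_j (d_j - z_j) \tag{3.3-6} $$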
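The weight adjustment that uses the error signal directly, $\mathbf{W}_{j+1} = \mathbf{W}_j + 2\gamma_j \boldsymbol{\eta}_j (d_j - z_j)$, can be sketched as below. This is a toy illustration, not the report's processor: the function name `lms_train`, the constant gain, the uniformly distributed tap inputs, and the noise-free desired signal (generated from a hypothetical "true" weight vector) are all assumptions of the sketch.

```python
import random


def lms_train(true_weights, steps=3000, gamma=0.1, seed=0):
    """Adjust the weights with W <- W + 2*gamma*eta*(d - z)."""
    rng = random.Random(seed)
    n = len(true_weights)
    weights = [0.0] * n
    for _ in range(steps):
        eta = [rng.uniform(-1.0, 1.0) for _ in range(n)]       # delayed inputs
        d = sum(w * x for w, x in zip(true_weights, eta))      # desired signal
        z = sum(w * x for w, x in zip(weights, eta))           # summer output
        error = d - z                                          # must be available
        weights = [w + 2.0 * gamma * x * error
                   for w, x in zip(weights, eta)]
    return weights


if __name__ == "__main__":
    print(lms_train([0.5, -0.25]))
```

The sketch makes the practical difficulty visible: the loop needs `d` at every step, that is, the desired signal itself must be on hand as a real-time function.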

The above adjustment procedure requires the availability of the error
function as a real-time function. This requirement is not convenient in dealing
with communication problems such as filtering and detection, and it must therefore
be removed.

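This is done by rewriting $Q(e)$ as

$$ Q(e) = [d(t) - z(t)]^2 = z^2(t) + d^2(t) - 2 d(t) z(t) = z^2(t) + d^2(t) - 2 d(t) \sum_{k=1}^{K(M+1)} w_k [\xi_k(t) + \nu_k(t)] \tag{3.3-7} $$

Let

$$ Q_1 = z^2(t) \tag{3.3-8} $$

$$ Q_2 = d^2(t) - 2 d(t) z(t) \tag{3.3-9} $$

and note

$$ \nabla Q_1 = 2 z \nabla z = 2 \boldsymbol{\eta} z \tag{3.3-10} $$

$$ E\{\nabla Q_2\} = -2 E\{d \boldsymbol{\xi}\} \triangleq -2 \mathbf{R}_{d\xi} \tag{3.3-11} $$

where

$$ \mathbf{R}_{d\xi} = E \begin{bmatrix} d\xi_1 \\ d\xi_2 \\ \vdots \\ d\xi_{M+1} \\ \vdots \\ d\xi_{K(M+1)} \end{bmatrix} = E \begin{bmatrix} d(t) s_1(t) \\ d(t) s_1(t-\Delta) \\ \vdots \\ d(t) s_1(t-M\Delta) \\ \vdots \\ d(t) s_K(t-M\Delta) \end{bmatrix} = E \begin{bmatrix} d(t) d(t+\tau_1) \\ d(t) d(t+\tau_1-\Delta) \\ \vdots \\ d(t) d(t+\tau_1-M\Delta) \\ \vdots \\ d(t) d(t+\tau_K-M\Delta) \end{bmatrix} = \begin{bmatrix} R_d(\tau_1) \\ R_d(\tau_1-\Delta) \\ \vdots \\ R_d(\tau_1-M\Delta) \\ \vdots \\ R_d(\tau_K-M\Delta) \end{bmatrix} \tag{3.3-12} $$

In the above $R_d(\tau)$ is just the autocorrelation function of the desired
signal $d(t)$. For any given number of taps and their spacings, together with the
known signal direction, $\mathbf{R}_{d\xi}$ can be completely specified if $R_d(\tau)$ is given.

Substituting (3.3-10) and (3.3-11) into (3.2-28), we obtain the desired
algorithm to adjust the weight vector

$$ \mathbf{W}_{j+1} = \mathbf{W}_j - 2 \gamma_j z_j \boldsymbol{\eta}_j + 2 \gamma_j \mathbf{R}_{d\xi} \tag{3.3-13} $$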

During the training period, the only prior information required to adjust the weights
is the signal autocorrelation function; $z_j$ and $\boldsymbol{\eta}_j$ are available as real-time
functions. Algorithm (3.3-13) will be used extensively in designing an array
processor. Its convergence properties are given in the next two sections. The
implementation of this adaptive mechanism is very simple.
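To make the mechanism of algorithm (3.3-13) concrete, the following Python sketch runs the update $\mathbf{W}_{j+1} = \mathbf{W}_j - 2\gamma_j z_j \boldsymbol{\eta}_j + 2\gamma_j \mathbf{R}_{d\xi}$ on a two-tap toy problem. Everything about the toy problem is an assumption made for illustration: the signal appears as $d$ and $0.5d$ at the two taps, the noise is independent with unit variance, the gain sequence is $\gamma_j = 0.5/(j+10)$, and the names are hypothetical. For this model the optimum weights $(\mathbf{R}_\xi + \mathbf{R}_\nu)^{-1}\mathbf{R}_{d\xi}$ work out to $(4/9, 2/9)$.

```python
import random

R_D_XI = [1.0, 0.5]                  # E{d * xi} for the toy signal model below
W_OPT = [4.0 / 9.0, 2.0 / 9.0]       # (R_xi + R_nu)^-1 R_dxi for the toy model


def train_with_known_correlation(r_d_xi, steps=20000, seed=0):
    """Iterate W <- W - 2*g*z*eta + 2*g*R_dxi; the error signal is never used."""
    rng = random.Random(seed)
    weights = [0.0, 0.0]
    for j in range(1, steps + 1):
        gamma = 0.5 / (j + 10)                         # assumed gain sequence
        d = rng.gauss(0.0, 1.0)                        # signal sample (hidden)
        xi = [d, 0.5 * d]                              # signal part at each tap
        eta = [s + rng.gauss(0.0, 1.0) for s in xi]    # tap outputs with noise
        z = weights[0] * eta[0] + weights[1] * eta[1]  # summer output
        weights = [w - 2.0 * gamma * z * x + 2.0 * gamma * r
                   for w, x, r in zip(weights, eta, r_d_xi)]
    return weights


if __name__ == "__main__":
    print(train_with_known_correlation(R_D_XI), W_OPT)
```

Note that `d` is used only to synthesize the tap outputs; the update itself sees only `z`, `eta`, and the fixed vector `r_d_xi`.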


A rather detailed structure is shown schematically in Fig. 7.

[Figure 7. Adaptive Mechanism. Legible labels: $R_d(\tau_i - k\Delta)$, $l = (i-1)M + k$]


In comparing the formula for the optimum gains

$$ \mathbf{W}_{op} = (\mathbf{R}_\xi + \mathbf{R}_\nu)^{-1} \mathbf{R}_{d\xi} $$

with the recursive procedure

$$ \mathbf{W}_{j+1} = \mathbf{W}_j + 2 \gamma_j \mathbf{R}_{d\xi} - 2 \gamma_j z_j \boldsymbol{\eta}_j $$

we see several advantages in using adaptive tapped-delay-line filters over nonadaptive
tapped-delay-line filters:

(1) No noise field measurements are required since the weights are adjusted in
the presence of normal hydrophone outputs.

(2) No solutions of simultaneous equations for the weights are required.

(3) When the signal correlation functions are used in (3.3-13) the difficulty
of generating some simulated signals as proposed by Widrow et al. [16] is
completely removed.

(4) It is not necessary to assume that the undesired interferences originate from
point sources. The noise can take any realistic form.


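3.4 Physical Interpretations of the Convergence Conditions

It has been shown that algorithm (3.3-13)

$$ \mathbf{W}_{j+1} = \mathbf{W}_j - 2 \gamma_j z_j \boldsymbol{\eta}_j + 2 \gamma_j \mathbf{R}_{d\xi} \tag{3.4-1} $$

is derived from (3.2-28) with $\mathbf{c}$ replaced by $\mathbf{W}$, i.e.,

$$ \mathbf{W}_{j+1} = \mathbf{W}_j - \gamma_j (\nabla Q_1 + \nabla Q_2) \tag{3.4-2} $$

Although (3.4-2) converges both in mean square and in probability under certain
mathematical conditions, it is not clear whether these conditions can be met in
reality. These conditions are repeated here for convenience.

$$ \text{(A)} \quad \lim_{j \to \infty} \gamma_j = 0, \quad \sum_{j=1}^{\infty} \gamma_j = \infty, \quad \sum_{j=1}^{\infty} \gamma_j^2 < \infty, \quad \gamma_j > 0 $$

$$ \text{(B)} \quad \inf_{\epsilon < \|\mathbf{W} - \mathbf{W}_{op}\| < 1/\epsilon} (\mathbf{W} - \mathbf{W}_{op})^T E\{\nabla Q_1 + \nabla Q_2\} > 0 $$

$$ \text{(C)} \quad E\{(\nabla Q_1 + \nabla Q_2)^T (\nabla Q_1 + \nabla Q_2)\} < d\,(\mathbf{W}_{op}^T \mathbf{W}_{op} + \mathbf{W}^T \mathbf{W}), \quad d > 0 $$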

The choice of $\gamma_j$ which satisfies (A) is rather at our own disposal. We can
always set $\gamma_j = \gamma / j^{\alpha}$ where $\gamma > 0$ and $1/2 < \alpha \le 1$ to fulfill the requirements
of (A). The remaining conditions depend on the surface of the error
gradient, which in turn depends on the choice of error criterion and the physical
system under consideration.
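That a power-law gain sequence meets the three requirements of (A) follows from the standard p-series test. The short check below is a textbook argument added for convenience, not a derivation taken from the report:

```latex
\gamma_j = \frac{\gamma}{j^{\alpha}}, \quad \gamma > 0: \qquad
\lim_{j\to\infty} \gamma_j = 0 \ \ \text{for } \alpha > 0, \qquad
\sum_{j=1}^{\infty} \gamma_j = \gamma \sum_{j=1}^{\infty} j^{-\alpha} = \infty \ \ \text{for } \alpha \le 1, \qquad
\sum_{j=1}^{\infty} \gamma_j^{2} = \gamma^{2} \sum_{j=1}^{\infty} j^{-2\alpha} < \infty \ \ \text{for } \alpha > \tfrac{1}{2}.
```

All three hold together exactly when the exponent lies above one half and does not exceed one.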

We shall show in the following two lemmas that conditions (B) and (C) are
satisfied if (1) the error function $Q(e)$ is strictly convex; (2) the second
derivative of $Q(e)$ with respect to $e$ exists and is uniformly bounded; (3) all
signals (useful signal, ambient noise and interferences) are generated from physically
realizable sources and thus their second order statistics are uniformly
bounded. The first two conditions are definitely satisfied because the performance
criterion employed here is just the mean squared error so that $Q(e) = e^2$,
which is strictly convex, and $\partial^2 Q / \partial e^2 = 2$ is uniformly bounded. The third condition
concerning the boundedness of the correlation functions of the input processes
is also satisfied in most practical situations.

Consequently, we can conclude that all the convergence conditions can be met
in practice and the adaptive schemes should be operative in adjusting the weights
on the tapped-delay lines.


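Lemma 1: For the tapped-delay-line filters, if $Q(e) = e^2$, then at the neighborhood
of $\mathbf{W}_{op}$ minimizing $E\{Q(e)\}$ the following statement is true:

$$ \inf_{\epsilon < \|\mathbf{W} - \mathbf{W}_{op}\| < 1/\epsilon} (\mathbf{W} - \mathbf{W}_{op})^T E\{\nabla Q_1 + \nabla Q_2\} > 0, \quad \epsilon > 0 $$

Proof: Since $I = E[Q]$ has a minimum at $\mathbf{W} = \mathbf{W}_{op}$, we can write for
$k = 1, 2, \ldots, K(M+1)$

$$ \frac{\partial I}{\partial w_k} \gtreqless 0 \quad \text{for} \quad w_k \gtreqless w_{op}^{(k)} \tag{3.4-3} $$

$$ (w_k - w_{op}^{(k)}) \frac{\partial I}{\partial w_k} \ge 0 \quad \text{for all } k \tag{3.4-4} $$

$$ (\mathbf{W} - \mathbf{W}_{op})^T E\{\nabla Q_1 + \nabla Q_2\} > 0, \quad \epsilon < \|\mathbf{W} - \mathbf{W}_{op}\| < \frac{1}{\epsilon} \tag{3.4-5} $$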


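Lemma 2:

Let $Q = Q_1 + Q_2 = e^2$. If the second order statistics of the input
processes are bounded, then for the tapped-delay-line filters under study the
following condition is always satisfied:

$$ E\{(\nabla Q_1 + \nabla Q_2)^T (\nabla Q_1 + \nabla Q_2)\} < k_1 (\mathbf{W}_{op}^T \mathbf{W}_{op} + \mathbf{W}^T \mathbf{W}), \quad k_1 > 0 $$

Proof:

Using a Taylor series expansion about $\mathbf{W} = \mathbf{W}_{op}$, we have

$$ \nabla Q(\mathbf{W}) = \nabla Q(\mathbf{W}) \Big|_{\mathbf{W} = \mathbf{W}_{op}} + \mathbf{J} \Big|_{\mathbf{W} = \mathbf{W}_{op}} (\mathbf{W} - \mathbf{W}_{op}) \tag{3.4-6} $$

where $\mathbf{J}$ is the Jacobian having elements

$$ J_{kl} = \frac{\partial^2 Q}{\partial w_k\, \partial w_l}, \quad k, l = 1, 2, \ldots, K(M+1) \tag{3.4-7} $$

Since the error function is

$$ e(t) = d(t) - z(t) = d(t) - \sum_{k=1}^{K(M+1)} w_k \eta_k(t) \tag{3.4-8} $$

we see that

$$ \frac{\partial Q}{\partial w_k} = \frac{\partial Q}{\partial e} \frac{\partial e}{\partial w_k} = \frac{\partial Q}{\partial e} [-\eta_k(t)] \tag{3.4-9} $$

and

$$ J_{kl} = \frac{\partial^2 Q}{\partial e^2}\, \eta_k(t)\, \eta_l(t) = 2\, \eta_k(t)\, \eta_l(t) \tag{3.4-10} $$

Therefore, the averaged value of the error gradient can be written in the following
form in view of the above expressions

$$ \overline{\nabla Q} = \overline{\nabla Q} \Big|_{\mathbf{W} = \mathbf{W}_{op}} + 2 \mathbf{R}_\eta (\mathbf{W} - \mathbf{W}_{op}) = 2 \mathbf{R}_\eta (\mathbf{W} - \mathbf{W}_{op}) \tag{3.4-11} $$

because $\overline{\nabla Q} = 0$ at the optimum point and $\mathbf{R}_\eta$ is the input correlation matrix
with elements $\overline{\eta_k(t) \eta_l(t)}$ for $k, l = 1, 2, \ldots, K(M+1)$.

Note that

$$ E\{(\nabla Q_1 + \nabla Q_2)^T (\nabla Q_1 + \nabla Q_2)\} \le E\{\nabla^T Q_1 + \nabla^T Q_2\}\, E\{\nabla Q_1 + \nabla Q_2\} \tag{3.4-12} $$

and for real variables

$$ a^2 + b^2 \ge -2ab \ \ \text{from} \ \ (a + b)^2 \ge 0, \qquad (a - b)^2 = a^2 + b^2 - 2ab \le 2(a^2 + b^2) \tag{3.4-13} $$

The desired result is obtained by substituting Eq. (3.4-12) into Eq. (3.4-11)
and setting a constant

$$ k_1 = \sup_{\text{all } k, l} \left| \overline{\eta_k(t) \eta_l(t)} \right| \tag{3.4-14} $$

for all $k$ and $l$. The constant $k_1$ defined above will be bounded if the second
order statistics of the input processes are bounded.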

3.5 Convergence Properties of the Adaptive Tapped-Delay-Line Filters

Having found an algorithm which converges in some sense, we shall now
investigate how fast it converges. In other words, we would like to know how
fast the parameters approach their optimum values and the mean-squared error at each
stage during the adaptation period. The effect of the input statistics on the
rate of convergence will be determined. The adaptive behavior when the weights are
adjusted in the presence or absence of the target signal will also be studied.

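Rewrite algorithm (3.3-13) here for convenience

$$ \mathbf{W}_{j+1} = \mathbf{W}_j + 2 \gamma_j \mathbf{R}_{d\xi} - 2 \gamma_j z_j \boldsymbol{\eta}_j \tag{3.5-1} $$

Since the summer output is

$$ z_j = \boldsymbol{\eta}_j^T \mathbf{W}_j \tag{3.5-2} $$

we have

$$ \mathbf{W}_{j+1} = (\mathbf{I} - 2 \gamma_j \boldsymbol{\eta}_j \boldsymbol{\eta}_j^T) \mathbf{W}_j + 2 \gamma_j \mathbf{R}_{d\xi} \tag{3.5-3} $$

Taking the mathematical expectation of (3.5-3) and diagonalizing the input correlation
matrix $\mathbf{R}_\eta$ such that

$$ \mathbf{R}_\eta = \mathbf{P}^{-1} \boldsymbol{\Lambda} \mathbf{P} \tag{3.5-4} $$

we obtain

$$ E[\mathbf{W}_{j+1}] = (\mathbf{I} - 2 \gamma_j \mathbf{P}^{-1} \boldsymbol{\Lambda} \mathbf{P})\, E[\mathbf{W}_j] + 2 \gamma_j \mathbf{R}_{d\xi} \tag{3.5-5} $$

where $\mathbf{P}$ is an orthonormal matrix and $\boldsymbol{\Lambda}$ is the corresponding eigenvalue matrix.
Some comments are in order.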

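1) In Eq. (3.5-4) the input correlation matrix assumes different values
depending upon whether the input contains noise only or signal plus noise. When
both the target signal and the undesired noise are present, the output of the $i$th
hydrophone is

$$ x_i(t) = s_i(t) + n_i(t) \tag{3.5-6} $$

so that the various delayed inputs $\eta_k(t)$ contain signal components $\xi_k(t)$ as
well as noise components $\nu_k(t)$

$$ \eta_k(t) = \xi_k(t) + \nu_k(t), \quad k = 1, 2, \ldots, K(M+1) \tag{3.5-7} $$

and

$$ \boldsymbol{\eta} z = (\boldsymbol{\xi} + \boldsymbol{\nu})(\boldsymbol{\xi} + \boldsymbol{\nu})^T \mathbf{W} \tag{3.5-8} $$

$$ E[\boldsymbol{\eta} z] = E[\boldsymbol{\xi} \boldsymbol{\xi}^T + \boldsymbol{\nu} \boldsymbol{\nu}^T]\, E[\mathbf{W}] = (\mathbf{R}_\xi + \mathbf{R}_\nu)\, E[\mathbf{W}] \tag{3.5-9} $$

$\mathbf{R}_\xi$ and $\mathbf{R}_\nu$ are the input signal correlation and input noise correlation
matrices. Thus, it is important to keep in mind that

$$ \mathbf{R}_\eta = \mathbf{R}_\xi + \mathbf{R}_\nu \quad \text{when} \quad x_i(t) = s_i(t) + n_i(t) \tag{3.5-10} $$

and

$$ \mathbf{R}_\eta = \mathbf{R}_\nu \quad \text{when} \quad x_i(t) = n_i(t) \tag{3.5-11} $$

2) In taking the average over Eq. (3.5-8) it is assumed that $\mathbf{W}$ is
statistically independent of $\boldsymbol{\eta}$. Although $\mathbf{W}$ cannot affect $\boldsymbol{\eta}$ in any manner,
the increment of $\mathbf{W}$ at each stage is related to $\boldsymbol{\eta}$ by Eq. (3.5-1). Since the increment
is generally very small and the total effect involves addition of a large
number of small increments, we can assume $E[\boldsymbol{\eta} \boldsymbol{\eta}^T \mathbf{W}] = E[\boldsymbol{\eta} \boldsymbol{\eta}^T]\, E[\mathbf{W}]$ in a manner
similar to that used in the analysis of phase-locked loops¹.
Thus for large $j$ (at later stages during the training period) there should be
little correlation between $\mathbf{W}_{j+1}$ and $\boldsymbol{\eta}_{j+1}$.

In returning to Eq. (3.5-5), let us define a new weight vector

$$ \mathbf{W}' = \mathbf{P} \mathbf{W} \tag{3.5-12} $$

and a new delayed input vector

$$ \boldsymbol{\eta}' = \mathbf{P} \boldsymbol{\eta} \tag{3.5-13} $$

Since $\mathbf{R}_{d\xi} = \mathbf{R}_\eta \mathbf{W}_{op}$² as seen from Eq. (2.3-25),
we transform Eq. (3.5-5) into

$$ E[\mathbf{W}'_{j+1}] = (\mathbf{I} - 2 \gamma_j \boldsymbol{\Lambda})\, E[\mathbf{W}'_j] + 2 \gamma_j \boldsymbol{\Lambda} \mathbf{W}'_{op} \tag{3.5-14} $$

or

$$ E[\mathbf{W}'_{j+1}] - \mathbf{W}'_{op} = (\mathbf{I} - 2 \gamma_j \boldsymbol{\Lambda}) (E[\mathbf{W}'_j] - \mathbf{W}'_{op}) \tag{3.5-15} $$

¹ See A. J. Viterbi, Principles of Coherent Communication, McGraw-Hill Book Co., N.Y., 1966.

² The optimum weight vector $\mathbf{W}_{op}$ assumes $\mathbf{R}_\eta^{-1} \mathbf{R}_{d\xi}$ or $\mathbf{R}_\nu^{-1} \mathbf{R}_{d\xi}$ depending on the
training environment.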


Now consider any particular component of $\mathbf{W}'$, and for clarity no subscript
or superscript denoting the component is used. Then we obtain a difference
equation for $E[w'_j] = \bar{w}'_j$

$$ \bar{w}'_{j+1} - w'_{op} = (1 - 2 \gamma_j \lambda)(\bar{w}'_j - w'_{op}) \tag{3.5-16} $$

whose solution is

$$ \bar{w}'_{j+1} = (w'_1 - w'_{op}) \prod_{k=1}^{j} (1 - 2 \gamma_k \lambda) + w'_{op} \tag{3.5-17} $$
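We shall now calculate the mean square of the weights.

If we first take the outer product and then the mathematical expectation on
both sides of Eq. (3.5-1), we can write after some algebraic manipulations¹

$$ \overline{\mathbf{W}_{j+1} \mathbf{W}_{j+1}^T} = \overline{\mathbf{W}_j \mathbf{W}_j^T} + 4 \gamma_j \left\{ \mathbf{R}_\eta \overline{(\mathbf{W}_{op} - \mathbf{W}_j) \mathbf{W}_j^T} \right\}^S + 4 \gamma_j^2\, \overline{e_j^2 \boldsymbol{\eta}_j \boldsymbol{\eta}_j^T} \tag{3.5-18} $$

where $\{\mathbf{A}\}^S$ denotes the symmetric part of matrix $\mathbf{A}$, e.g.
$\{\mathbf{A} \mathbf{B}^T\}^S = \frac{1}{2} (\mathbf{A} \mathbf{B}^T + \mathbf{B} \mathbf{A}^T)$.

For large $j$, the following approximation can be made

$$ \overline{e_j^2 \boldsymbol{\eta}_j \boldsymbol{\eta}_j^T} \approx e_{min}^2\, \overline{\boldsymbol{\eta}_j \boldsymbol{\eta}_j^T} = e_{min}^2 \mathbf{R}_\eta \tag{3.5-19} $$

which can be viewed as a Taylor series expansion around the optimum point
with higher order terms neglected.

Therefore, Eq. (3.5-18) becomes

$$ \overline{\mathbf{W}_{j+1} \mathbf{W}_{j+1}^T} = \overline{\mathbf{W}_j \mathbf{W}_j^T} + 4 \gamma_j \left\{ \mathbf{R}_\eta \overline{(\mathbf{W}_{op} - \mathbf{W}_j) \mathbf{W}_j^T} \right\}^S + 4 \gamma_j^2 e_{min}^2 \mathbf{R}_\eta \tag{3.5-20} $$

¹ This is done by combining the following steps:

a. $\mathbf{W}_{j+1} \mathbf{W}_{j+1}^T = \mathbf{W}_j \mathbf{W}_j^T + 2 \gamma_j e_j (\mathbf{W}_j \boldsymbol{\eta}_j^T + \boldsymbol{\eta}_j \mathbf{W}_j^T) + 4 \gamma_j^2 e_j^2 \boldsymbol{\eta}_j \boldsymbol{\eta}_j^T$

b. $\overline{e_j (\mathbf{W}_j \boldsymbol{\eta}_j^T + \boldsymbol{\eta}_j \mathbf{W}_j^T)} = 2 \left\{ \mathbf{R}_\eta \overline{(\mathbf{W}_{op} - \mathbf{W}_j) \mathbf{W}_j^T} \right\}^S$

c. $\{\ \}^S$ is used to make the expression compact and the superscript can be
removed in dealing with diagonal terms of a square matrix.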

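Using the transformation $\mathbf{W}' = \mathbf{P} \mathbf{W}$ defined in Eq. (3.5-12), we can express the
diagonal terms of the above matrix as

$$ \left\{ \overline{\mathbf{W}'_{j+1} \mathbf{W}'^T_{j+1}} \right\}^D = \left\{ \overline{\mathbf{W}'_j \mathbf{W}'^T_j} \right\}^D + 4 \gamma_j \left\{ \boldsymbol{\Lambda}\, \overline{(\mathbf{W}'_{op} - \mathbf{W}'_j) \mathbf{W}'^T_j} \right\}^D + 4 \gamma_j^2 e_{min}^2 \boldsymbol{\Lambda} \tag{3.5-21} $$

while the outer product of Eq. (3.5-14) is

$$ \bar{\mathbf{W}}'_{j+1} \bar{\mathbf{W}}'^T_{j+1} = \bar{\mathbf{W}}'_j \bar{\mathbf{W}}'^T_j + 4 \gamma_j \left\{ \boldsymbol{\Lambda} (\mathbf{W}'_{op} - \bar{\mathbf{W}}'_j) \bar{\mathbf{W}}'^T_j \right\}^S + 4 \gamma_j^2 \boldsymbol{\Lambda} (\mathbf{W}'_{op} - \bar{\mathbf{W}}'_j)(\mathbf{W}'_{op} - \bar{\mathbf{W}}'_j)^T \boldsymbol{\Lambda} \tag{3.5-22} $$

Let

$$ \mathbf{V}_j = \overline{(\mathbf{W}'_j - \bar{\mathbf{W}}'_j)(\mathbf{W}'_j - \bar{\mathbf{W}}'_j)^T} = \overline{\mathbf{W}'_j \mathbf{W}'^T_j} - \bar{\mathbf{W}}'_j \bar{\mathbf{W}}'^T_j \tag{3.5-23} $$

Subtracting the diagonal terms of Eq. (3.5-22) from those of Eq. (3.5-21), and
neglecting the term of second order in $\gamma_j (\mathbf{W}'_{op} - \bar{\mathbf{W}}'_j)$, yields

$$ \mathbf{V}^D_{j+1} = (\mathbf{I} - 4 \gamma_j \boldsymbol{\Lambda}) \mathbf{V}^D_j + 4 \gamma_j^2 e_{min}^2 \boldsymbol{\Lambda} \tag{3.5-24} $$

which has elements of the form $\overline{(w'_j)^2} - (\bar{w}'_j)^2$.

Thus, for any particular component of $\mathbf{V}_j$, we have

$$ v_{j+1} = \overline{(w'_{j+1})^2} - (\bar{w}'_{j+1})^2 = (1 - 4 \gamma_j \lambda)\, v_j + 4 \gamma_j^2 \lambda e_{min}^2 \tag{3.5-25} $$

Iterating backward,

$$ v_{j+1} = v_1 \prod_{k=1}^{j} (1 - 4 \gamma_k \lambda) + 4 \lambda e_{min}^2 \sum_{k=1}^{j} \gamma_k^2 \prod_{l=k+1}^{j} (1 - 4 \gamma_l \lambda) \tag{3.5-26} $$

But $v_1 = 0$ because $\overline{(w'_1)^2} = (\bar{w}'_1)^2$, so that

$$ v_{j+1} = \overline{(w'_{j+1} - \bar{w}'_{j+1})^2} = 4 \lambda e_{min}^2 \sum_{k=1}^{j} \gamma_k^2 \prod_{l=k+1}^{j} (1 - 4 \gamma_l \lambda) \tag{3.5-27} $$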
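The step from the one-component variance recursion to its summed form is an instance of solving a first-order linear difference equation, and can be verified numerically. In the Python sketch below the function names and all numerical values are hypothetical; the recursion starts from zero variance, as in the text.

```python
def variance_recursive(lam, e_min_sq, gammas):
    """Step v <- (1 - 4*g*lam) * v + 4*g^2*lam*e_min^2, starting from v_1 = 0."""
    v = 0.0
    for g in gammas:
        v = (1.0 - 4.0 * g * lam) * v + 4.0 * g * g * lam * e_min_sq
    return v


def variance_closed_form(lam, e_min_sq, gammas):
    """Evaluate 4*lam*e_min^2 * sum_k gamma_k^2 * prod_{l>k} (1 - 4*gamma_l*lam)."""
    count = len(gammas)
    total = 0.0
    for k in range(count):
        tail = 1.0
        for l in range(k + 1, count):           # factors from later steps only
            tail *= 1.0 - 4.0 * gammas[l] * lam
        total += gammas[k] ** 2 * tail
    return 4.0 * lam * e_min_sq * total


if __name__ == "__main__":
    gains = [1.0 / (2.0 * (k + 1) * 1.5) for k in range(1, 41)]
    print(variance_recursive(1.5, 0.3, gains),
          variance_closed_form(1.5, 0.3, gains))
```

Each past gain contributes its own noise term, attenuated by all the factors applied after it, which is why the gains must shrink for the variance to die out.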


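Using Eqs. (3.5-17) and (3.5-27) we shall find the rate of convergence in terms
of weight variance and the mean squared error.

The weight variance is

$$ \overline{\|\mathbf{W}_{j+1} - \mathbf{W}_{op}\|^2} = \overline{\|\mathbf{W}'_{j+1} - \mathbf{W}'_{op}\|^2} = \sum_{m=1}^{K(M+1)} \overline{\left( w'^{(m)}_{j+1} - w'^{(m)}_{op} \right)^2} \tag{3.5-28} $$

and the mean squared error at each stage of adaptation is, from Eq. (2.3-28),

$$ e_{j+1}^2 - e_{min}^2 = (\mathbf{W}_{j+1} - \mathbf{W}_{op})^T \mathbf{R}_\eta (\mathbf{W}_{j+1} - \mathbf{W}_{op}) = (\mathbf{W}'_{j+1} - \mathbf{W}'_{op})^T \boldsymbol{\Lambda} (\mathbf{W}'_{j+1} - \mathbf{W}'_{op}) \tag{3.5-29} $$

where $\boldsymbol{\Lambda}$ and $\mathbf{W}'$ are defined by Eq. (3.5-4) and Eq. (3.5-12) respectively.

The expected difference between the mean squared error at each stage during the
training period and the minimum mean squared error is then

$$ E[e_{j+1}^2 - e_{min}^2] = E\left\{ \sum_{m=1}^{K(M+1)} \lambda_m \left( w'^{(m)}_{j+1} - w'^{(m)}_{op} \right)^2 \right\} = \sum_{m=1}^{K(M+1)} \lambda_m \overline{\left( w'^{(m)}_{j+1} - w'^{(m)}_{op} \right)^2} \tag{3.5-30} $$

Since

$$ \overline{(w'_{j+1} - w'_{op})^2} = \overline{(w'_{j+1} - \bar{w}'_{j+1})^2} + (\bar{w}'_{j+1} - w'_{op})^2 \tag{3.5-31} $$

the weight variance is

$$ \overline{\|\mathbf{W}_{j+1} - \mathbf{W}_{op}\|^2} = \sum_{m=1}^{K(M+1)} \left[ 4 \lambda_m e_{min}^2 \sum_{k=1}^{j} \gamma_k^2 \prod_{l=k+1}^{j} (1 - 4 \gamma_l \lambda_m) + \left( w'^{(m)}_1 - w'^{(m)}_{op} \right)^2 \prod_{k=1}^{j} (1 - 2 \gamma_k \lambda_m)^2 \right] \tag{3.5-32} $$

and the mean squared error becomes

$$ \overline{e_{j+1}^2} - e_{min}^2 = \sum_{m=1}^{K(M+1)} 4 \lambda_m^2 e_{min}^2 \sum_{k=1}^{j} \gamma_k^2 \prod_{l=k+1}^{j} (1 - 4 \gamma_l \lambda_m) + \sum_{m=1}^{K(M+1)} \lambda_m \left( w'^{(m)}_1 - w'^{(m)}_{op} \right)^2 \prod_{k=1}^{j} (1 - 2 \gamma_k \lambda_m)^2 \tag{3.5-33} $$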


Eqs. (3.5-32) and (3.5-33) are the desired expressions for the rate of convergence.
Although they appear to be quite complicated, simpler results can be achieved by
setting the weighting sequence $\gamma_j$ to some special forms.


a)  y 


j  2  (j+3.)  X 


(3.5-34) 


th 


This  really  me ar  v  that  y^  is  a  diagonal  matrix  whose  m  element  is 

for  n  ■  1,2, . . ,K(M+1) .  Eq.  (3.5-34)  is  a  legitimate  choice  as  it 


2(j+l)Xfc 

satisfies  all  the  required  conditions  for  convergence. 
Since 

k*ld  2  Yk  a)  k=*l  (1  ~  k+l)  *  j+1 


(3.5-35) 


and  [see  (C-8)  of  Appendix  C] 

n  q  .  4  v  x)  a  n  (1  -  —■ — )  = 

E-k+1'  yT}  2«k+lU  2+l;  ^  j 

the  mean  squared  error  at  eec.h  stage  is 


(3.5-36) 


2  2 
ej+l  emin 


<j+D 


K(M+1) 


2  m=l 


2  ,  1 

e  ,  +  - 

min 


(j+1) 


<»;<■>- v  <”V 

2  m*l  ml  op 


K (M+l)  e2  -  J  ■■ 

mln  (j+1)2 


+  — <wi  ~  ”  >T  «  J 

(j+1)2  -1  _°p  ^ _1  ~°p 


(3.5-37) 


and the weight variance is

$$E\,\|W_{j+1} - W_{op}\|^2 = \frac{j}{(j+1)^2}\, e_{\min}^2 \sum_{m=1}^{K(M+1)} \frac{1}{\lambda_m} + \frac{1}{(j+1)^2}\, \|W_1 - W_{op}\|^2 \tag{3.5-38}$$

The sequence (3.5-34) is the optimum choice of $\gamma_j$: it provides the fastest rate of convergence and can be derived by making $\overline{e_{j+1}^2}$ a minimum for each $j = 1, 2, \ldots$. Simulation results also confirm this argument.
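As a small numerical check of the expressions above, the following Python sketch evaluates the product in Eq. (3.5-35) directly and the right-hand side of Eq. (3.5-37). The function names and the argument `initial_excess` (standing for $(W_1 - W_{op})^T R (W_1 - W_{op})$) are illustrative assumptions, not part of the report.

```python
from math import isclose, prod


def mean_decay_factor(j, lam):
    """Product over k = 1..j of (1 - 2*gamma_k*lam), gamma_k = 1/(2*(k+1)*lam)."""
    return prod(1.0 - 2.0 * lam / (2.0 * (k + 1) * lam) for k in range(1, j + 1))


def excess_mse_optimum_sequence(j, n_weights, e_min_sq, initial_excess):
    """Right-hand side of Eq. (3.5-37).

    n_weights      -- K(M+1), the number of adjustable gains
    e_min_sq       -- minimum mean squared error
    initial_excess -- assumed value of (W_1 - W_op)^T R (W_1 - W_op)
    """
    return (n_weights * e_min_sq * j + initial_excess) / (j + 1) ** 2


if __name__ == "__main__":
    # The product does not depend on the eigenvalue and equals 1/(j+1).
    print(mean_decay_factor(9, 3.7), 1.0 / 10.0)
    # At j = 0 only the initial excess error remains.
    print(excess_mse_optimum_sequence(0, 100, 1.0, 5.0))
```

For large `j` the first term of the returned value dominates, which is the behaviour discussed in the text that follows.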


The significance of the above expressions is apparent.

The initial error at time $j = 1$ is just the norm of the difference between the initial and optimum gains in the parameter space, i.e.,

$$\overline{e_1^2} = e_{\min}^2 + (W_1 - W_{op})^T R\, (W_1 - W_{op}) \tag{3.5-39}$$


The initial mean-squared error will decrease at the rate of $1/j^2$, $j$ being the adaptation time. However, the first term on the right-hand side of Eq. (3.5-37) decreases at the rate of $j/(j+1)^2 \approx 1/j$ and will definitely dominate the second term during the later training period. It is proportional to the total number of gains being adjusted and to the minimum obtainable mean-squared error using that many taps. That is, for large $j$

$$\overline{e_{j+1}^2} - e_{\min}^2 \approx \frac{K(M+1)}{j}\, e_{\min}^2 \tag{3.5-40}$$


It appears that more training time would be required to make $\overline{e_{j+1}^2}$ approach $e_{\min}^2$ if more weights [$K(M+1)$ in number] are to be adjusted. Actually, $e_{\min}^2$ is a monotone decreasing function of $M$. It was shown in Chapter II that

$$e_{\min}^2 = \frac{1}{2\pi} \int_{-\infty}^{\infty} |H(\omega)|^2\, d\omega - \sum_{m=1}^{K(M+1)} w_m^2 \tag{3.5-41}$$


where $H(\omega)$ is the continuous transfer function to be approximated. Since this quantity does not decrease linearly with $M$, there is always a compromise to be made between the training time and the accuracy of approximation. Therefore we should keep in mind that using too many taps may do more harm (longer training period) than good (smaller mean-squared error). That the mean squared error decreases at the rate of $1/j$ for $\gamma_j \sim 1/j$ is a well-known fact in employing the methods of stochastic approximation. Here, however, we have shown explicitly the dependence of the error rate on system parameters such as the number of hydrophones in an array, the number of taps for each individual filter, and the input statistics. As a simple illustration, suppose that we want to reduce the error in a single filter to within about 1% of its minimum


$$\frac{\overline{e_{j+1}^2} - e_{\min}^2}{e_{\min}^2} \approx 0.01$$

then for 100 taps we need roughly $j = 10{,}000$ samples to adjust the weights, or equivalently about 10 seconds of real-time data for a sampling rate of 1,000 samples per second.
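The arithmetic of this illustration follows from Eq. (3.5-40), where the relative excess error is roughly $K(M+1)/j$. A minimal Python sketch, with hypothetical function names, is:

```python
def samples_needed(n_weights, relative_excess):
    """Approximate adaptation time j from Eq. (3.5-40): relative excess ~ n_weights / j."""
    return n_weights / relative_excess


def seconds_needed(n_weights, relative_excess, sample_rate_hz):
    """Real-time duration of the training data at the given sampling rate."""
    return samples_needed(n_weights, relative_excess) / sample_rate_hz


if __name__ == "__main__":
    # 100 taps, 1% excess error, 1,000 samples per second (the case in the text).
    print(samples_needed(100, 0.01), seconds_needed(100, 0.01, 1000.0))
```

Doubling the number of taps at the same error target doubles the estimated training time, which is the trade-off noted above.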

The  time  required  to  make  this  same  adjustment  in  an  array  of  filters  need 
not  be  much  greater  since  the  adjustment  can  be  done  in  parallel  processors 
operating  simultaneously.  Parallel  processing  is  quite  feasible  here,  since  the 
basic  algorithm  is  so  simple. 


b) $\gamma_j = \dfrac{1}{2(j+1)}$    (3.5-42)

The choice of $\gamma_j$ defined in Eq. (3.5-34) requires some a priori knowledge about the input statistics. If the noise correlation matrix is unknown, an assumption forcing us to apply adaptive techniques, the input correlation matrix and thus the eigenvalues cannot be determined. Here we shall consider an arbitrary sequence $\gamma_j = 1/[2(j+1)]$.
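With this sequence each factor $1 - 2\gamma_k\lambda$ equals $1 - \lambda/(k+1)$, and the running product of such factors has a closed form in Gamma functions. The Python sketch below compares the direct product, the Gamma-function form, and its large-$j$ power-law approximation. The function names are illustrative, and the sketch assumes $0 < \lambda < 2$ so that $\Gamma(2-\lambda)$ is finite and positive.

```python
from math import exp, gamma, isclose, lgamma


def product_direct(j, lam):
    """Product over k = 1..j of (1 - lam/(k+1))."""
    p = 1.0
    for k in range(1, j + 1):
        p *= 1.0 - lam / (k + 1)
    return p


def product_gamma(j, lam):
    """Gamma(j+2-lam) / ((j+1)! * Gamma(2-lam)), evaluated through log-Gamma."""
    return exp(lgamma(j + 2 - lam) - lgamma(j + 2) - lgamma(2 - lam))


def product_asymptotic(j, lam):
    """Large-j approximation 1 / (Gamma(2-lam) * (j+1)**lam)."""
    return 1.0 / (gamma(2 - lam) * (j + 1) ** lam)


if __name__ == "__main__":
    for lam in (0.5, 1.0, 1.5):
        print(lam, product_direct(2000, lam), product_gamma(2000, lam),
              product_asymptotic(2000, lam))
```

For $\lambda = 1$ the product is exactly $1/(j+1)$, the value obtained with the optimum sequence of case a).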

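Since (see Appendix C)

$$\prod_{k=1}^{j} (1 - 2\gamma_k\lambda) = \frac{\Gamma(j+2-\lambda)}{(j+1)!\, \Gamma(2-\lambda)} \approx \frac{1}{\Gamma(2-\lambda)\,(j+1)^{\lambda}} \quad \text{for } j \gg 1 \text{ and } j \gg \lambda \tag{3.5-43}$$

and

$$\prod_{\ell=k+1}^{j} \left(1 - \frac{2\lambda}{\ell+1}\right) \approx \left(\frac{k+1}{j+1}\right)^{2\lambda} \tag{3.5-44}$$

Eq. (3.5-33) becomes

$$\overline{e_{j+1}^2} - e_{\min}^2 \approx \sum_{m=1}^{K(M+1)} \left\{ \frac{\lambda_m^2 e_{\min}^2}{(j+1)^{2\lambda_m}} \sum_{k=1}^{j} (k+1)^{2\lambda_m - 2} + \frac{\lambda_m}{(j+1)^{2\lambda_m}} \left( \frac{w'^{(m)}_1 - w'^{(m)}_{op}}{\Gamma(2-\lambda_m)} \right)^2 \right\} \tag{3.5-45}$$

which of course reduces to Eq. (3.5-37) when $\lambda_m = 1$ for $m = 1, 2, \ldots, K(M+1)$.

c) $\gamma_j = \gamma = \text{constant}$    (3.5-46)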


The expressions for $\gamma_j$ defined by Eqs. (3.5-34) and (3.5-42) satisfy the conditions for convergence as stated in Section 3.2. In these cases the $\gamma_j$'s, and thus the gain increments $\Delta W_j = W_{j+1} - W_j$, become smaller and smaller as time $j$ proceeds during the adaptation period. It is anticipated that the rate of convergence will be increased if a small constant value is set for $\gamma$. As shown by Comer [42], the algorithm with constant $\gamma$ has comparatively little noise resistance. Furthermore, in the presence of measuring error with variance $\sigma^2$, convergence in the usual sense does not occur, but

$$\lim_{j\to\infty} \|W_{j+1} - W_{op}\|^2 < F(\gamma, \sigma^2) \tag{3.5-46}$$

where $F(\gamma, \sigma^2) \to 0$ as $\gamma \to 0$.

We shall next study the rate of convergence when $\gamma$ is a constant. From Eq. (3.5-17) we see that with $\gamma_j = \gamma = \text{constant}$

$$\overline{w'_{j+1}} = (w'_1 - w'_{op}) \prod_{i=1}^{j} (1 - 2\gamma\lambda) + w'_{op} \tag{3.5-47}$$

$$= (1 - 2\gamma\lambda)^j\, (w'_1 - w'_{op}) + w'_{op} \tag{3.5-48}$$


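Since

$$a + aY + aY^2 + \cdots + aY^{n-1} = \frac{a\,(1 - Y^n)}{1 - Y} \tag{3.5-49}$$

we can obtain

$$\sum_{k=1}^{j} (1 - 4\gamma\lambda)^{-k} = \frac{1}{4\gamma\lambda} \left[ (1 - 4\gamma\lambda)^{-j} - 1 \right] \tag{3.5-50}$$

Thus,

$$\sum_{k=1}^{j} \gamma_k^2 \prod_{\ell=k+1}^{j} (1 - 4\gamma\lambda) = \sum_{k=1}^{j} \gamma^2 (1 - 4\gamma\lambda)^{j-k} = \gamma^2 (1 - 4\gamma\lambda)^{j} \sum_{k=1}^{j} (1 - 4\gamma\lambda)^{-k} = \gamma^2\, \frac{1}{4\gamma\lambda} \left[ 1 - (1 - 4\gamma\lambda)^{j} \right] \tag{3.5-51}$$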


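and Eq. (3.5-27) becomes

$$\overline{(w'_{j+1} - \overline{w'_{j+1}})^2} = e_{\min}^2\, \gamma \left[ 1 - (1 - 4\gamma\lambda)^{j} \right] \tag{3.5-52}$$

The mean-squared error is then

$$\overline{e_{j+1}^2} - e_{\min}^2 = e_{\min}^2\, \gamma \sum_{m=1}^{K(M+1)} \lambda_m \left[ 1 - (1 - 4\gamma\lambda_m)^{j} \right] + \sum_{m=1}^{K(M+1)} \lambda_m \left(w'^{(m)}_1 - w'^{(m)}_{op}\right)^2 (1 - 2\gamma\lambda_m)^{2j} \tag{3.5-53}$$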

It is seen from the above expression that if the error is to decrease at all, one basic requirement should be met, i.e.,

$$0 < 1 - 4\gamma\lambda_m < 1 \quad \text{with } \gamma > 0 \tag{3.5-54}$$

for $m = 1, 2, \ldots, K(M+1)$, which implies

$$0 < \gamma < \frac{1}{4\lambda_{\max}} \tag{3.5-55}$$

where $\lambda_{\max}$ is the largest eigenvalue of the correlation matrix $R$. Thus $\gamma = \text{constant}$ cannot be set at will if stability of the adaptive loop is to be maintained.
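The constant-gain behaviour can be explored numerically. The Python sketch below evaluates an expression of the form of Eq. (3.5-53) for a given set of eigenvalues and returns the bound of Eq. (3.5-55). The function names, the argument layout, and the exponents $j$ and $2j$ are assumptions of this sketch (the exponents follow from summing the geometric series over $k = 1, \ldots, j$).

```python
from math import isclose


def max_stable_gamma(eigenvalues):
    """Upper bound 1 / (4 * lambda_max) on the constant gain, Eq. (3.5-55)."""
    return 1.0 / (4.0 * max(eigenvalues))


def excess_mse_constant_gain(j, gamma, eigenvalues, init_sq_errors, e_min_sq):
    """Excess mean squared error after j steps with constant gain.

    eigenvalues    -- lambda_m of the input correlation matrix
    init_sq_errors -- (w'_1 - w'_op)^2 for each transformed coordinate m
    """
    noise = e_min_sq * gamma * sum(
        lam * (1.0 - (1.0 - 4.0 * gamma * lam) ** j) for lam in eigenvalues)
    transient = sum(
        lam * d2 * (1.0 - 2.0 * gamma * lam) ** (2 * j)
        for lam, d2 in zip(eigenvalues, init_sq_errors))
    return noise + transient


if __name__ == "__main__":
    lams, d2s = [1.0, 0.5], [1.0, 2.0]
    print(max_stable_gamma(lams))
    for j in (0, 10, 100, 10000):
        print(j, excess_mse_constant_gain(j, 0.1, lams, d2s, 0.2))
```

Under these assumptions the transient term dies out geometrically, while the noise term levels off at $\gamma\, e_{\min}^2 \sum_m \lambda_m$ instead of vanishing, in line with the remark that convergence in the usual sense does not occur for constant $\gamma$.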

Conclusion 

The adjustable gains under the operation of our proposed adaptive scheme using signal correlation functions converge to two different sets of optimum values depending on whether or not the input contains the target signal during the training period, i.e.,

$$\lim_{j\to\infty} W_j = (R_\xi + R_\nu)^{-1} R_{d\xi} \quad \text{if } x = s + n$$

$$\lim_{j\to\infty} W_j = R_\nu^{-1} R_{d\xi} \quad \text{if } x = n .$$

The mean squared error decreases approximately as $1/j$, the inverse first power of the adaptation time. The rate of convergence is essentially indifferent to the number of weights to be adjusted, as our algorithm allows simultaneous adjustments. The size of the error, on the other hand, does depend on the total number of taps and on the difference between the initial and the optimum values of the weights. It is also of importance to note that the weighting sequence cannot be selected at will. Although $\gamma_j = \gamma / j^{\alpha}$, $1/2 < \alpha \le 1$, satisfies all the conditions for convergence for any positive constant $\gamma$, this constant should not exceed $1/(4\lambda_{\max})$ if stability of the adaptive loop is to be maintained. This is especially important during the early stages of adaptation. Simulation results are given in Chapter Six.

3.6 Further Remarks on the Operations of the Proposed System
a) Choice of the Initial Weights

Although the adjustable weights can be set to any values at the beginning of the adaptive process, it is desirable to set them not too far from their optimum values by using whatever information is available concerning the statistics of the noise field. The formula for calculating the optimum gains can be used to carry out the initial computation with inaccurate noise statistics. This kind of choice will shorten the adaptation period and thus reduce the cost of operation. In cases where absolutely no such information is known, the gains associated with the $i$th delayed input ($i = 1, 2, \ldots, K$) are set to 1 and the rest to zero, so that a square law detector is used at the starting moments. As the adaptation proceeds, the whole system will gradually be transformed into an optimum one. Any target signal not detectable during the early stages can probably be ferreted out at a later time.

b)  Problem  of  Signal  Suppression 

In most adaptive detection systems, such as those of Glaser [18] and Jackowatz et al. [19], more errors are made as the input signal-to-noise ratio is decreased. In fact, it has been hypothesized that if the signal-to-noise ratio is gradually decreased, eventually a point is reached where instability occurs, with consequent breakdown of the system. That is, for signal-to-noise ratios below a certain level, the number of errors degrades the quality of the measurements to the extent that the use of the erroneous measurements by the detector results in even more errors. This, in turn, causes even poorer measurements, and so on until a complete collapse of the system performance to an error rate of one-half occurs. In our system, however, adaptation always takes place regardless of the presence of the target signal. It is therefore reasonable to anticipate that there will be no signal suppression phenomenon.

c) Problem of Uncertain Signal Power

In designing most non-adaptive optimum detection systems, complete statistical knowledge is required for both the signal and the noise. That is, their spectral shapes as well as their power levels are assumed to be known. The proposed adaptive system assumes no information about the noise fields, which represents one of the major advantages of applying iterative procedures. Although it is reasonable to assume that the general shape of the signal spectrum is known, the signal power level may in some cases be uncertain before detection. To gain more insight into the operation of the proposed system one may ask how the uncertain signal power affects the system performance. This can be answered by studying what algorithm (3.5-1) converges to if the signal is indeed present in the postulated direction but has a power level different from that assumed.

It is shown in Appendix D that if the assumed signal power differs from the actual power by a multiplicative constant, the gains adjusted according to algorithm (3.5-1) will converge in mean as well as in mean square to their optimum values multiplied by the same constant. Consequently, the asymptotic structure of the proposed system will differ from the optimum one by a multiplicative constant if incorrect signal power is assumed during the adaptation period. For a fixed threshold the performance of the detector, in terms of false alarm rate and miss probability, will be degraded to an extent depending on how far the constant deviates from unity. The threshold should therefore be adjusted around its normal operating level as a function of the signal power. If, however, some kind of display device is available, as in most practical cases, to observe the directivity pattern of the array system, uncertain signal power will not affect the sensitivity of the pattern. The output signal-to-noise ratio remains essentially unchanged.
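The scaling property stated above can be illustrated with a single adjustable gain. The Python sketch below is not the procedure of Appendix D; it assumes an averaged (noise-free) form of the update, $w \leftarrow w + 2\gamma_j\,[c\,R_s - (R_s + R_n)\,w]$, in which $c$ is the hypothetical ratio of assumed to actual signal power and $R_s$, $R_n$ are scalar signal and noise powers.

```python
def limiting_gain(scale, r_s, r_n, n_steps):
    """Iterate the averaged single-gain update from w = 0.

    scale -- assumed signal power divided by the actual signal power (hypothetical c)
    r_s   -- actual signal power R_s(0)
    r_n   -- noise power R_n(0)
    """
    w = 0.0
    r_x = r_s + r_n
    for j in range(1, n_steps + 1):
        gamma_j = 1.0 / (2.0 * (j + 1) * r_x)   # decreasing weighting sequence
        w += 2.0 * gamma_j * (scale * r_s - r_x * w)
    return w


if __name__ == "__main__":
    w_correct = limiting_gain(1.0, 1.0, 3.0, 1000)
    w_doubled = limiting_gain(2.0, 1.0, 3.0, 1000)
    print(w_correct, w_doubled, w_doubled / w_correct)
```

In this sketch the gain obtained with the doubled power assumption is twice the gain obtained with the correct one, so only the overall scale of the filter changes, not its shape.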




CHAPTER  FOUR 


ADAPTATION IN A NONSTATIONARY ENVIRONMENT

4.1  Introduction 

In Chapter III iterative procedures were derived which involve no noise statistics and no explicit time-averaging. It is generally noted that the optimum processor can do significantly better than the conventional processor for highly directional noise. However, highly directional noise fields are likely to be nonstationary. For example, in the sonar array problem, the most likely sources of directional noise or interference are ships, and ships may be moving. In this situation the input covariance matrix, and hence the optimum gains on the tapped-delay lines, will be functions of time. It is obviously desirable to modify the algorithms in such a way that adaptation can still be accomplished in a nonstationary environment. Otherwise, if we still use the same algorithm to estimate the gains, the actual optimum point in the parameter space will have moved to some other place before a steady state is reached.

This is a very important problem frequently encountered in practice.

In this chapter we shall consider several partial solutions to this difficult problem. These solutions are partial because each of them applies only to very restrictive cases under particular assumptions. If the law governing the parameter variation is known completely, we can generalize the dynamic stochastic approximation method [36] to adjust the time-varying parameters. In case the dynamics of the parameter variations are generated by a special mechanism and some pertinent statistics are available, we can then apply the Kalman filtering techniques to this nonstationary problem. The cases most often encountered in practice are nevertheless different from these two. We cannot expect to know the equation of parameter variation exactly, nor do we have complete statistics. If all the information we have about the noise fields is the rate of change, we shall just use the ordinary procedure and determine the effect of nonstationarity on its convergence properties.

4.2  Application  of  the  Method  of  Dynamic  Stochastic  Approximation 

When the random environment is nonstationary with time-varying statistics, the optimum parameter set $\theta$ becomes a function of the time index $j$. Its value at time $j$ will be denoted by $\theta_j$. It is assumed here that the law governing the variation is known, although the sequence to be estimated is unknown. In this case the generalized dynamic stochastic approximation method developed in Appendix E can be applied to the design of adaptive tapped-delay line filters. If the variation of $\theta$ is governed by a known operator $L$ such that $\theta_{j+1} = L(\theta_j)$, then the desired algorithm is given by (Eq. E-19)

$$C_{j+1} = L(C_j) - \gamma_j \nabla Q_j \tag{4.2-1}$$

where the $C$'s are replaced by the $W$'s. Upon using (4.2-1) the adjustment procedure for our delay line filters becomes

$$W_{j+1} = L(W_j) + 2\gamma_j R_{d\xi} - 2\gamma_j z_j \eta_j \tag{4.2-2}$$

The above formulation is restricted to the case where the dynamics of the optimum set $\theta$ are described by a homogeneous difference equation

$$\theta_{j+1} = L_{j+1,j}\, \theta_j \tag{4.2-3}$$

where $L_{j+1,j}$, not necessarily linear, may be assumed to be a state transition matrix (if $\theta_j$ is treated as the state of the system at $t_j$) with the properties

$$L_{j,j} = I = \text{unity matrix for all } j, \qquad L_{j,k}^{-1} = L_{k,j}, \qquad \text{and} \qquad L_{i,j}\, L_{j,k} = L_{i,k} .$$

Thus, if the law governing the variation of the optimum set is completely known, we can always take the time-varying effect into account and adjust the parameters systematically to their optimum state. However, in most practical cases such as sonar detection problems, this variation is random owing to the random nature of unpredictable environmental effects such as thermal noise, surface agitation, flow noise, cavitation noise, moving interferences, etc. The time-varying trend can then be described only statistically, by its measured or estimated frequency response or spectrum.

It seems likely that the well-known Wiener prediction theory can be applied to estimate the variation. But there is a serious drawback in using this theory. Since $\theta_{j+1}$ is estimated from $\theta_j$ and possibly other previous states, the problem at hand is similar to that of a random walk. The estimation error at each step may be small, but the accumulated error can be (though not necessarily) very large. Convergence in mean square or in probability is not assured. In the next section we shall modify the sequence such that $\gamma_j \to \gamma_j + \delta_j$ in using the ordinary stochastic approximation method. The sequence $\gamma_j$ satisfies the usual conditions, $\gamma_j = \gamma / j^{\alpha}$, $\gamma > 0$, $1/2 < \alpha \le 1$, and $\delta_j$ is used to correct the time-varying effect. As a limit $\gamma_j \to 0$ when $j \to \infty$, but $\delta_j$ will converge to a small constant. Since the optimum set is always moving, some adjustments should be made at all times.
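The need for a non-vanishing step can be seen in a deliberately simplified, noise-free sketch. It assumes a single gain whose optimum drifts linearly by `drift` per step, a gradient equal to the current error $c_j - \theta_j$, and a constant correction `delta`. None of these specifics come from the report; they are illustrative assumptions only.

```python
def tracking_error(n_steps, drift, gamma, alpha, delta):
    """Error c_j - theta_j after n_steps updates.

    Update:  c_{j+1} = c_j - (gamma / j**alpha + delta) * (c_j - theta_j)
    Optimum: theta_{j+1} = theta_j + drift   (assumed linear drift)
    """
    err = 0.0  # start exactly at the optimum
    for j in range(1, n_steps + 1):
        step = gamma / j ** alpha + delta
        err = (1.0 - step) * err - drift
    return err


if __name__ == "__main__":
    # A purely decreasing step falls further and further behind the optimum.
    print(tracking_error(10000, 0.001, 1.0, 1.0, 0.0))
    # A small constant correction keeps the lag near -drift / delta.
    print(tracking_error(10000, 0.001, 1.0, 1.0, 0.1))
```

With `delta = 0` the lag grows in proportion to the adaptation time, whereas a small positive `delta` holds it at a constant level, at the price of never letting the step size vanish.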

4. 3  Application  of  the  Kalman  Filtering  Techniques 

It has been shown in our previous developments that the adaptive tapped-delay line filters designed via the methods of stochastic approximation using the current input information can asymptotically converge to the Wiener filters. Since Wiener filters are designed for stationary processes and their extensions to the time-varying case are the Kalman filters, we shall apply the Kalman filtering techniques to the design of adaptive tapped-delay-line filters with the hope that more rapidly convergent algorithms can be obtained and that, at the same time, adaptation in nonstationary environments can be achieved. Consider a discrete filter consisting of tapped-delay-lines and $K(M+1)$ adjustable gains $W$. Let $\Delta$ be the tap spacing




The  matrix  is  assumed  to  be  non-negative-definite  unless  otherwise  stated. 


Let the optimum gains be generated by a source described by a first order dynamic system¹

$$\theta_j = a\, \theta_{j-1} + u_{j-1} \tag{4.3-8}$$

where the constant $0 < a < 1$ is a scalar and $u_{j-1}$ is a vector random sequence with known statistics

$$E[u_j] = 0 \tag{4.3-9}$$

$$E[u_j u_k^T] = Q_j\, \delta_{jk} \tag{4.3-10}$$

The matrix $Q_j$ is assumed to be nonnegative-definite, so it is possible that $Q_j = 0$. It is also assumed that the random sequences $v_j$ and $u_j$ are uncorrelated.

Employing  the  well-known  Kalman  filtering  techniques  (a  summary  is  given  in 
Appendix  F) ,  the  estimate  of  the  optimum  gains  at  stage  (j  +  1)  can  be  calcu¬ 
lated  from  its  previous  value  by 


-j+1  ‘  -i  +  (dJ  '  a  ^ 


*1 


£j  ZLj  lj 


C  -  (a2  *J-i  +  Vi^  +  4 


(4.3-lla) 

(4.3-llb) 

(4.3-llc) 


¹ It is to be noted that Eq. (4.3-8) is a rather particular model for which the results presented in this section hold true. This assumption makes the present approach a partial solution. For slowly varying parameters, the variance of the random sequence $u_j$, $\mathrm{var}[u_j] = (1 - a^2)\, \mathrm{var}[\theta]$, is small and permits us to assume $a = 1$.


(4.3-llc)  can  be  rewritten  in  a  more  convenient  form  for  computational  purpose 
Pj  -  a2  +  Qj_1  -  (a2  +  Qj_1) 


(4.3-llc) 


P  is  a  K(H+1)  x  K(M+1) -dimensional  matrix. 

Since we are at liberty to process new data only one at a time, and for the slowly time-varying case $a \approx 1$ [37], we arrive at the following simpler formulation for the iteration process

$$W_{j+1} = W_j + \Gamma_{j+1}\, \eta_{j+1}\, (d_{j+1} - \eta_{j+1}^T W_j) \tag{4.3-12a}$$

$$\Gamma_{j+1} = \frac{1}{\Phi}\, P_{j+1} \tag{4.3-12b}$$

$$P_{j+1}^{-1} = (P_j + qI)^{-1} + \frac{1}{\Phi}\, \eta_{j+1} \eta_{j+1}^T \tag{4.3-12c}$$

or, equivalently,

$$P_{j+1} = P_j + qI - \frac{(P_j + qI)\, \eta_{j+1} \eta_{j+1}^T\, (P_j + qI)}{\Phi + \eta_{j+1}^T (P_j + qI)\, \eta_{j+1}} \tag{4.3-12c}$$

where

$$\Phi = \mathrm{Var}(v_j) \tag{4.3-13}$$

$$qI = \mathrm{Var}(u_j) \tag{4.3-14}$$

for stationary white random sequences $\{v_j\}$ and $\{u_j\}$. Algorithm (4.3-12) will be discussed for both stationary and nonstationary cases. Its relationship to the method of stochastic approximation will also be given.
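A single-gain version of this recursion is short enough to write out. The Python sketch below assumes a scalar gain, so the matrices $P$ and $\Gamma$ reduce to numbers, together with the reading of the update in which the error variance is predicted as $P_j + q$, corrected by the scalar matrix-inversion formula, and then scaled by $1/\Phi$ to give the weighting. The function names are illustrative.

```python
from math import isclose


def kalman_gain_step(w, p, eta, d, q, phi):
    """One update of a single adjustable gain.

    w, p -- current gain estimate and its error variance
    eta  -- current (scalar) delayed input
    d    -- current desired output
    q    -- variance of the random drift of the optimum gain
    phi  -- variance of the measurement noise
    """
    m = p + q                                          # predicted variance
    p_new = m - (m * eta) ** 2 / (phi + eta * m * eta)  # corrected variance
    w_new = w + (p_new / phi) * eta * (d - eta * w)     # weighting = p_new / phi
    return w_new, p_new


def run(data, w=0.0, p=1.0, q=0.0, phi=1.0):
    """Process (eta, d) pairs in order and return the final (w, p)."""
    for eta, d in data:
        w, p = kalman_gain_step(w, p, eta, d, q, phi)
    return w, p


if __name__ == "__main__":
    print(run([(1.0, 2.0), (2.0, 3.0), (-1.0, -0.5)]))
    print(run([(1.0, 1.0)] * 500, q=0.01))
```

With `q = 0` and `phi = 1` the result coincides with a least-squares fit that includes the initial variance as a prior. With `q > 0` the variance `p` settles at a positive level, so the gain continues to be adjusted at every step.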


a) Stationary Case

In the stationary case the optimum gains are time invariant so that $a = 1$ and $u_j = 0$ for all $j$. Suppose that $\Phi = 1$*; (4.3-12) reduces to

$$W_{j+1} = W_j + \Gamma_{j+1}\, \eta_{j+1}\, (d_{j+1} - \eta_{j+1}^T W_j) \tag{4.3-15a}$$

$$\Gamma_{j+1}^{-1} = \Gamma_j^{-1} + \eta_{j+1} \eta_{j+1}^T \tag{4.3-15b}$$

Combining Eqs. (4.3-15a) and (4.3-15b) gives an alternative expression

$$W_{j+1} = (H_j^T H_j)^{-1} H_j^T D_j \tag{4.3-16}$$

which is just the solution of minimizing the error (4.3-4) by the least square fit. The relationship between optimal filtering and least square fit has been pointed out by various authors [43, 44]. The sequence described by Eq. (4.3-16) will be shown to converge to the optimum gains

$$\theta = R_\eta^{-1} R_{d\eta} = R_\eta^{-1} R_{d\xi} \tag{4.3-17}$$

This is done by rewriting Eq. (4.3-16) in the form

$$W_{j+1} = \left[ \frac{1}{j} \sum_{i=1}^{j} \eta_i \eta_i^T \right]^{-1} \left[ \frac{1}{j} \sum_{i=1}^{j} \eta_i d_i \right] \tag{4.3-18}$$

Applying the strong law of large numbers and making use of the fact that continuous functions of convergent random sequences are also convergent, we can state that

$$\lim_{j\to\infty} \left[ \frac{1}{j} \sum_{i=1}^{j} \eta_i \eta_i^T \right]^{-1} = \left( E[\eta \eta^T] \right)^{-1} = R_\eta^{-1} \tag{4.3-19}$$

with probability one if $R_\eta^{-1}$ exists and is positive definite. $R_\eta$ always meets these requirements because it is the correlation matrix of the delayed inputs. By the same token we can state

$$\lim_{j\to\infty} \frac{1}{j} \sum_{i=1}^{j} \eta_i d_i = E[\eta\, d] = R_{d\eta} \tag{4.3-20}$$

with probability one, if $R_{d\eta}$ exists. Consequently one concludes that

$$\mathrm{Prob}\left\{ \lim_{j\to\infty} W_j = R_\eta^{-1} R_{d\eta} = \theta \right\} = 1 \tag{4.3-21}$$

and the limit $\theta$ minimizes the mean squared error defined by Eq. (4.3-2). It is noted that in using algorithm (4.3-15) the initial estimate of the parameters $W_1$ is arbitrary and $\Gamma_1$ is finite and positive definite. We can just set $W_1 = 0$ and $\Gamma_1 = I$ = unit matrix to start the iteration process. The connection between algorithm (4.3-15) and the ordinary method of stochastic approximation can be constructed as follows:

* This corresponds to the problem of minimizing $\|D_j - H_j \theta\|^2$ instead of the weighted norm $\|D_j - H_j \theta\|^2_{R_j^{-1}}$ used in Eq. (F-7). The weighting matrix $R_j$ is required if the measuring noise is Gaussian distributed with zero mean and covariance matrix $R_j$. For details, see [44].

From Eq. (4.3-15b) we have

$$\Gamma_j^{-1} = \Gamma_{j-1}^{-1} + \eta_j \eta_j^T = \Gamma_1^{-1} + \sum_{i=2}^{j} \eta_i \eta_i^T \tag{4.3-22}$$

which for large $j$ converges to

$$\Gamma_j^{-1} \to \Gamma_1^{-1} + j R_\eta \approx j R_\eta$$

Thus the weighting matrix appearing in Eq. (4.3-15) approaches

$$\Gamma_j \to \frac{1}{j}\, R_\eta^{-1} \tag{4.3-23}$$

and the corresponding algorithm becomes

$$W_{j+1} = W_j + \frac{1}{j+1}\, R_\eta^{-1}\, \eta_{j+1}\, (d_{j+1} - \eta_{j+1}^T W_j) \tag{4.3-24}$$


Eq. (4.3-24) is just the adjustment procedure derived from the ordinary method of stochastic approximation

$$W_{j+1} = W_j + \gamma_{j+1}\, \eta_{j+1}\, (d_{j+1} - \eta_{j+1}^T W_j) \tag{4.3-25}$$

with

$$\gamma_{j+1} = \frac{1}{j+1}\, R_\eta^{-1} \tag{4.3-26}$$

$\gamma_{j+1}$ is considered as a weighting matrix in this case, and in the simplest case it is just $\gamma_{j+1} = \frac{\gamma}{j+1}\, I$ if $R_\eta$ is a diagonal matrix and $\gamma$ = constant. In the transformed parameter space where $W' = P^{-1} W$, $R_\eta = P \Lambda P^{-1}$, the optimum weighting sequence is $\gamma^{(k)}_{j+1} = \dfrac{1}{(j+1)\lambda_k}$, $\lambda_k$ being the $k$th eigenvalue of $R_\eta$. Explicit expressions for the rate of convergence have been derived in Section 3.5. Since algorithms (4.3-15) and (4.3-25) minimize the same quadratic criterion in the limit, it will be of interest to compare the convergence properties of the two algorithms. It is to be expected that algorithm (4.3-15) will be more rapidly convergent than algorithm (4.3-25) since at each stage of the iteration, algorithm (4.3-15) uses information from the inputs of all past stages whereas algorithm (4.3-25) only uses information from the input that is received at the current stage. The optimum sequence $\{\gamma_j\}$ defined by Eq. (4.3-26) can be predetermined only when we know the correlation matrix $R_\eta$, which contradicts our motivation of using adaptive techniques. That algorithm (4.3-15) converges faster than algorithm (4.3-25) with arbitrary $\gamma_j = \gamma / j^{\alpha}$, $1/2 < \alpha \le 1$, can be further illustrated by comparing the methods of minimization embodied in each of the algorithms.

Algorithm (4.3-25) proceeds in the direction of the negative gradient of $(d_j - \eta_j^T W)^2$ at the $j$th iteration stage. Algorithm (4.3-15), on the other hand, is a second order algorithm that selects the $W_j$ which minimizes $\sum_{i=1}^{j} (d_i - \eta_i^T W)^2$ at the $j$th iteration stage. On this basis, it seems plausible that algorithm (4.3-15) should be more rapidly convergent than algorithm (4.3-25), a conjecture that will be reinforced by the simulation results presented in Chapter Six.

The requirement that the desired filter output $d(t)$ be available can be removed using signal correlation functions.
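The comparison between the two procedures can be tried on a toy case. The Python sketch below runs a single-gain version of both recursions on the same deterministic, noise-free input pattern. The pattern, the true gain `theta`, the constant `gamma`, and the choice $\alpha = 1$ are all illustrative assumptions, and the outcome of one such run is not a general proof.

```python
from math import isclose


def compare(theta=0.7, n_steps=40, gamma=0.2, pattern=(1.0, -2.0, 0.5, 1.5)):
    """Return (error of the second-order recursion, error of the gradient recursion)."""
    w_ls, big_gamma = 0.0, 1.0   # least-squares type recursion, W_1 = 0, Gamma_1 = 1
    w_sa = 0.0                   # stochastic approximation with gamma_j = gamma / j
    for j in range(1, n_steps + 1):
        eta = pattern[(j - 1) % len(pattern)]
        d = theta * eta          # noise-free desired output (assumption)
        big_gamma = 1.0 / (1.0 / big_gamma + eta * eta)   # inverse weighting accumulates eta^2
        w_ls += big_gamma * eta * (d - eta * w_ls)
        w_sa += (gamma / j) * eta * (d - eta * w_sa)
    return abs(theta - w_ls), abs(theta - w_sa)


if __name__ == "__main__":
    err_ls, err_sa = compare()
    print(err_ls, err_sa)
```

In this example the recursion that accumulates all past inputs in its weighting ends much closer to the true gain than the gradient recursion with the arbitrary sequence, which is the behaviour the argument above leads one to expect.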

b) Nonstationary Case

In the general case where the optimum gains vary with time as a result of the nonstationary noise fields, algorithm (4.3-12) will have to be used to make adaptation possible in nonstationary environments. If we are willing to adjust the weighting matrix $\Gamma_{j+1}$ at each stage during the adaptation period, algorithm (4.3-12) is the desired procedure. If, however, we just want to modify the ordinary method of stochastic approximation such that

$$\gamma_j \to \gamma_j + \delta \tag{4.3-27}$$

where $\gamma_j = \gamma / j^{\alpha}$, $\gamma > 0$, $1/2 < \alpha \le 1$, then $\delta$ = constant can be found in the following discussion to correct the time-varying effect. As a limit $\gamma_j \to 0$ when $j \to \infty$, but adjustments are made at all times due to the presence of $\delta$.

Consider the adjustment procedure

$$W_{j+1} = W_j + \Gamma_{j+1}\, \eta_{j+1}\, (d_{j+1} - \eta_{j+1}^T W_j) \tag{4.3-28}$$

where the weighting matrix $\Gamma$ in the stationary case is

$$\Gamma_{j+1}^{-1} = \Gamma_j^{-1} + \eta_{j+1} \eta_{j+1}^T \tag{4.3-29}$$

and that in the nonstationary case is

$$\Gamma_{j+1}^{-1} = \left( \Gamma_j + \frac{q}{\Phi}\, I \right)^{-1} + \eta_{j+1} \eta_{j+1}^T \tag{4.3-30}$$

Following the arguments leading to Eq. (4.3-23), we can write from Eq. (4.3-29)

$$\Gamma_j \approx \frac{1}{j}\, R_\eta^{-1} \tag{4.3-31}$$


Since Eq. (4.3-30) is a nonlinear difference equation of the form

$$x_{j+1} = (x_j + a)^{-1} + y_{j+1}$$

whose explicit solution is not available*, we shall only consider the asymptotic behavior of Eq. (4.3-30) for large $j$ such that


r)+l  — j 

b*  7  ^+1  2j-fi  <<  ^+1  +  ^j+i^ 


—j+i  <<  — j+i 


If  the  above  assumptions  hold,  one  can  obtain  the  steady  state  il_  as 

lim  IT  -  q 
J-*" 

and  the  adjustment  constant  6  defined  in  Eq.  (4.3-27)  has  the  form 


(4. 3-32) 


(4.3-33) 


Therefore, a simple iterative procedure to adjust the gains in a nonstationary environment is

$$W_{j+1} = W_j + (\gamma_j + \delta)\, (d_{j+1} - W_j^T \eta_{j+1})\, \eta_{j+1} \tag{4.3-34}$$

The quantity $d_{j+1}\, \eta_{j+1}$ appearing in Eq. (4.3-34) can be replaced by the signal correlation function $R_{d\xi}$.

* See Appendix G for details.

Summary - Let the actual target wavefront $d_j$ and the summer output $z_j = \eta_j^T W_j$ be related at time $t_j$ by

$$d_j = \eta_j^T W_j + v_j \tag{4.3-35}$$

where the noise $v_j$ is an additive random sequence with known statistics $E[v_j] = 0$, $\mathrm{Var}(v_j) = \Phi$. Let the optimum gains $\theta_j$ be described by a first order dynamic system

$$\theta_j = a\, \theta_{j-1} + u_{j-1} \tag{4.3-36}$$

where the constant $0 < a < 1$ is a scalar and $u_j$ a vector random sequence with known statistics $E[u_j] = 0$, $E[u_j u_j^T] = qI$. The algorithm to estimate the adjustable gains $W$ which minimizes the mean squared error at each stage


Jj  •  j  k£i  \  ai 


(A. 3-37) 


can  be  written  in  the  general  form 


W...  “  +  [*Lr  -  EL.  H,  WJ 

-J+l  “j  j  -dC  T  "Tl  ~j 


(A. 3- 38) 


The weighting sequence $\Gamma_j$ assumes different expressions depending on the stationarity of the process and on whether an optimum estimation procedure (fastest convergence rate) is required. Four different choices of $\Gamma_j$ are listed below:


^tationarity, 
jopt  inali'ty  -  | 


stationary 


.-1 


optimum  |  Lj  »  I_j_1  + 


nonstationary 


nonoptimum 


’  'i  1 


r;1-  <1^  +  nT1*  £  a,  aj 


V'V?1 


where $\gamma_j = \gamma / j^{\alpha}$, $\gamma > 0$, $1/2 < \alpha \le 1$.


4.4 Nonstationarity and the Use of Ordinary Methods of Stochastic Approximation
a) Notations

Recall that the adjustment procedure used very frequently in this study is of the form

$$W_{j+1} = W_j + 2\gamma_j R_{d\xi} - 2\gamma_j z_j \eta_j \tag{4.4-1}$$

which is derived from the ordinary method of stochastic approximation

$$W_{j+1} = W_j - \gamma_j \nabla Q_j \tag{4.4-2}$$

The optimum gains for the multiple-sensor array

$$\theta = W_{op} = (R_\xi + R_\nu)^{-1} R_{d\xi} \tag{4.4-3}$$

are obtained by solving $E[\nabla Q_j] = 0$.

In order to illustrate the essential steps involved, we shall consider the simplest case where only a single filter is designed by adjusting a single gain constant. Extension to the general case is reasonably straightforward. The optimum gain in this case is then

$$\theta = c_{op} = [R_s(0) + R_n(0)]^{-1} R_s(0) \tag{4.4-4}$$

where $R_s(\tau)$ and $R_n(\tau)$ are respectively the signal and noise autocorrelation functions. When the noise field is time-varying such that

$$R_n(\tau, t) = R_n(\tau)\, [1 + f(t)] \tag{4.4-5}$$

the optimum gain is also a function of time $t$

$$\theta(t) = \{ R_s(0) + R_n(0)\, [1 + f(t)] \}^{-1} R_s(0) \tag{4.4-6}$$

whose value at time $t = j$ will be denoted by $\theta_j$.

In the above, $f(t)$ is a time function that depends entirely on the nonstationarity of the noise fields. This function will be a constant in the stationary case.

At time $t = j$, the optimum gain can be written from Eq. (4.4-6) as

$$\theta_j = \theta_0 + F_j \tag{4.4-7}$$

where

$$\theta_0 = [R_s(0) + R_n(0)]^{-1} R_s(0) \tag{4.4-8}$$

is the time-invariant optimum gain and

$$F_j = -\theta_0 \left\{ [R_s(0) + R_n(0)]^{-1} R_n(0)\, f_j - [R_s(0) + R_n(0)]^{-2} R_n^2(0)\, f_j^2 + \cdots \right\} \tag{4.4-9}$$

is the time-varying part resulting from a simple series expansion.
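The series expansion can be checked numerically for a single gain. The Python sketch below compares the exact optimum gain of Eq. (4.4-6) with a truncated geometric series in $x = R_n(0)\, f / [R_s(0) + R_n(0)]$. The function names and the truncation order are assumptions of this sketch, and the series is only meaningful for $|x| < 1$.

```python
def optimum_gain(r_s, r_n, f):
    """Exact single-gain optimum R_s / (R_s + R_n * (1 + f)), Eq. (4.4-6)."""
    return r_s / (r_s + r_n * (1.0 + f))


def series_gain(r_s, r_n, f, order=2):
    """theta_0 * (1 - x + x**2 - ...) truncated after the term x**order."""
    theta_0 = r_s / (r_s + r_n)
    x = r_n * f / (r_s + r_n)
    return theta_0 * sum((-x) ** n for n in range(order + 1))


if __name__ == "__main__":
    exact = optimum_gain(1.0, 3.0, 0.1)
    for order in (0, 1, 2, 3):
        print(order, series_gain(1.0, 3.0, 0.1, order), exact)
```

For a small fluctuation $f$ the first-order term already accounts for most of the change in the optimum gain, which is why the increment is later represented by its dominating part alone.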

Thus,  the  optimum  gain  at  any  instant  can  be  related  to  its  previous  value  by 

Vl  “  6j  +  ASj  (4.4-10) 

A0  ^  is  the  increment  of  0^  at  t  ■  j  and  given  by 

A9j  "  Fj+i  “  Yi 

•  VVr  VR»(0>  | _l  +  lR.(0)  +  R»(0),'lR«(0)(<j+i+  V  ■*■•••] 


(4.4-11) 


Let  the  dominating  part  of  A0  be  denoted  by  0 ( — ^ — ) . 

3  j" 


Then  we  have 


Vi'  V01^ 


(4.4-12) 


Up  to  now  we  have  not  posed  any  restrictions  on  to  so  that  Eq.  (4.4-12)  Is 

valid  in  general. 

b) Assumptions and Analysis

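It is assumed that in using the adjustment procedure

$$c_{j+1} = c_j - \gamma_j \nabla Q_j \tag{4.4-13}$$

the following conditions hold.

(i) There exist constants $K_\ell$ and $K_u$ such that

$$K_\ell\,|c_j - \theta_j| \le |\nabla Q_j| \le K_u\,|c_j - \theta_j| \tag{4.4-14}$$

for all $j$. This simply says that the gradient is of bounded variation. Note that

$$\nabla Q(c_j) > 0 \ \text{when}\ c_j > \theta_j, \qquad \nabla Q(c_j) < 0 \ \text{when}\ c_j < \theta_j \tag{4.4-15}$$

and

$$E[\nabla Q_j \mid c_1, \ldots, c_j] = (c_j - \theta_j)\,\mu \tag{4.4-16}$$

for some $\mu$, $K_\ell \le \mu \le K_u$.

(ii) The conditional variance of $\nabla Q_j$ is also bounded,

$$\mathrm{Var}[\nabla Q_j \mid c_1, \ldots, c_j] \le \sigma^2 < \infty \tag{4.4-17}$$

Since $\mathrm{Var}[x] = \overline{x^2} - \bar{x}^2$, we can write from Eqs. (4.4-16) and (4.4-17)

$$E[(\nabla Q_j)^2 \mid c_1, \ldots, c_j] = \mathrm{Var}[\nabla Q_j \mid c_1, \ldots, c_j] + \{E[\nabla Q_j \mid c_1, \ldots, c_j]\}^2 \le \sigma^2 + (c_j - \theta_j)^2 \mu^2 \tag{4.4-18}$$

(iii) The weighting sequence is of the form

$$\gamma_j = \frac{1}{j^{\alpha}}, \qquad \tfrac{1}{2} < \alpha \le 1 \tag{4.4-19}$$

Subtracting Eq. (4.4-12) from Eq. (4.4-13),

$$c_{j+1} - \theta_{j+1} = c_j - \theta_j - \gamma_j \nabla Q_j - O\!\left(\frac{1}{j^{\omega}}\right) \tag{4.4-20}$$

and squaring give

$$(c_{j+1} - \theta_{j+1})^2 = (c_j - \theta_j)^2 + \gamma_j^2 (\nabla Q_j)^2 + O\!\left(\frac{1}{j^{2\omega}}\right) - 2\gamma_j \nabla Q_j\,(c_j - \theta_j) + 2\gamma_j \nabla Q_j\, O\!\left(\frac{1}{j^{\omega}}\right) - 2\,(c_j - \theta_j)\, O\!\left(\frac{1}{j^{\omega}}\right) \tag{4.4-21}$$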


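Taking the conditional expectations on both sides of Eq. (4.4-21) yields

$$E[(c_{j+1} - \theta_{j+1})^2 \mid c_1, \ldots, c_j] = (c_j - \theta_j)^2 + \gamma_j^2 E[(\nabla Q_j)^2 \mid c_1, \ldots, c_j] + O\!\left(\frac{1}{j^{2\omega}}\right) - 2\gamma_j (c_j - \theta_j)\, E[\nabla Q_j \mid c_1, \ldots, c_j] + 2\gamma_j\, O\!\left(\frac{1}{j^{\omega}}\right) E[\nabla Q_j \mid c_1, \ldots, c_j] - 2\,(c_j - \theta_j)\, O\!\left(\frac{1}{j^{\omega}}\right) \tag{4.4-22}$$

From Eqs. (4.4-16) and (4.4-18) it follows that

$$E[(c_{j+1} - \theta_{j+1})^2 \mid c_1, \ldots, c_j] \le (c_j - \theta_j)^2 + \gamma_j^2 \left[\sigma^2 + (c_j - \theta_j)^2 \mu^2\right] + \frac{K_1}{j^{2\omega}} - 2\gamma_j \mu\,(c_j - \theta_j)^2 + \frac{K_2}{j^{\omega}}\,|c_j - \theta_j| \tag{4.4-23}$$

where the $K_i$'s are constants.

We shall now consider several ranges of $\omega$ relative to $\alpha$.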


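Case 1. $\omega > \alpha$.

After enlarging the corresponding coefficients, the terms of lower order of magnitude will include the terms of higher order. Thus, for $\omega > \alpha$, Eq. (4.4-23) becomes

$$E[(c_{j+1} - \theta_{j+1})^2 \mid c_1, \ldots, c_j] \le \left(1 - \frac{K_3}{j^{\alpha}}\right)(c_j - \theta_j)^2 + \frac{K_4}{j^{2\alpha}} + \frac{K_5}{j^{\omega}}\,|c_j - \theta_j| \tag{4.4-24}$$

Now we take (unconditional) expectations on both sides of Eq. (4.4-24). When estimating $E[|c_j - \theta_j|]$, we use the inequality [47]

$$E[|x|] \le \varepsilon + \varepsilon^{-1} E[x^2] \tag{4.4-25}$$

The inequality (4.4-25) holds true for every $\varepsilon > 0$ and every random variable $x$ with finite variance.

If we set $\varepsilon = j^{\alpha - \omega}/\delta$ for some small $\delta > 0$, then the unconditional expectation of Eq. (4.4-24) is

$$E[(c_{j+1} - \theta_{j+1})^2] \le \left(1 - \frac{K_3}{j^{\alpha}}\right) E[(c_j - \theta_j)^2] + \frac{K_4}{j^{2\alpha}} + \frac{K_5}{j^{\omega}} \left\{ \frac{j^{\alpha - \omega}}{\delta} + \delta\, j^{\omega - \alpha} E[(c_j - \theta_j)^2] \right\} \le \left(1 - \frac{K_6}{j^{\alpha}}\right) E[(c_j - \theta_j)^2] + \frac{K_4}{j^{2\alpha}} + \frac{K_7}{j^{2\omega - \alpha}} \tag{4.4-26}$$

A lemma due to K. L. Chung (1) will be used here.

Chung's lemma: Let $b_j$, $j = 1, 2, \ldots$, be real numbers such that for $j \ge j_0$

$$b_{j+1} \le \left(1 - \frac{a}{j^{s}}\right) b_j + \frac{b}{j^{t}} \tag{4.4-27}$$

where $0 < s < 1$, $a > 0$, $b > 0$, $t$ real. Then

$$\limsup_{j \to \infty}\; j^{\,t - s}\, b_j \le \frac{b}{a} \tag{4.4-28}$$

This lemma remains true if the inequalities (4.4-27) and (4.4-28) are reversed and, simultaneously, $\limsup_{j \to \infty}$ is changed into $\liminf_{j \to \infty}$.

Upon using Eqs. (4.4-27) and (4.4-28), we have

$$E[(c_j - \theta_j)^2] = O(j^{-\alpha}) \quad \text{for } \omega \ge \tfrac{3}{2}\alpha$$
$$E[(c_j - \theta_j)^2] = O(j^{-2\omega + 2\alpha}) \quad \text{for } \alpha < \omega < \tfrac{3}{2}\alpha \tag{4.4-29}$$

(1) Chung, K. L., "On a Stochastic Approximation Method," Ann. Math. Statist., vol. 25, pp. 463-483, 1954.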
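The way Chung's lemma fixes the decay rate can be checked numerically. The following is a minimal sketch, not part of the original analysis: it iterates the recursion of Eq. (4.4-27) with equality and reports the scaled quantity $j^{t-s} b_j$, which should settle near $b/a$. The function name and the constants used in the check are illustrative assumptions.

```python
def chung_scaled_value(a, b, s, t, n_steps, b_start=1.0):
    """Iterate b_{j+1} = (1 - a/j**s) * b_j + b/j**t (equality case of
    Eq. (4.4-27)) and return j**(t - s) * b_j evaluated at j = n_steps."""
    value = b_start
    for j in range(1, n_steps):
        value = (1.0 - a / j**s) * value + b / j**t
    return n_steps ** (t - s) * value


if __name__ == "__main__":
    # Hypothetical constants: a = 1, b = 2, s = 0.5, t = 1.5, so b/a = 2.
    for n in (10**3, 10**4, 10**5):
        print(n, chung_scaled_value(1.0, 2.0, 0.5, 1.5, n))
```

With $s = \alpha$, the lemma says the mean-square error decays like $j^{-(t - \alpha)}$, which is how the exponents in Eq. (4.4-29) arise from the slowest-decaying forcing term.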


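Case 2. $\omega < \alpha$.

Under this situation Eq. (4.4-23) reduces to

$$E[(c_{j+1} - \theta_{j+1})^2 \mid c_1, \ldots, c_j] \le \left(1 - \frac{K_3}{j^{\alpha}}\right)(c_j - \theta_j)^2 + \frac{K_5}{j^{\omega}}\,|c_j - \theta_j| + \frac{K_4}{j^{2\omega}} \tag{4.4-30}$$

If we take the unconditional expectations on both sides of Eq. (4.4-30) and follow steps similar to those leading from Eq. (4.4-24) to Eq. (4.4-26), we obtain

$$E[(c_{j+1} - \theta_{j+1})^2] \le \left(1 - \frac{K_6}{j^{\alpha}}\right) E[(c_j - \theta_j)^2] + \frac{K_7}{j^{2\omega - \alpha}} \tag{4.4-31}$$

Invoking Chung's lemma gives

$$E[(c_j - \theta_j)^2] = O(j^{2\alpha - 2\omega}) \tag{4.4-32}$$

Since, by assumption in this case, $\omega < \alpha$, we see that the sequence $\{c_j\}$, $j = 1, 2, \ldots$, will diverge.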

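Case 3. $\omega = \alpha$.

Following steps similar to the above two cases and letting $\varepsilon$ be a constant, we have

$$E[(c_{j+1} - \theta_{j+1})^2 \mid c_1, \ldots, c_j] \le \left(1 - \frac{K_3}{j^{\alpha}}\right)(c_j - \theta_j)^2 + \frac{K_4}{j^{2\alpha}} + \frac{K_5}{j^{\alpha}}\,|c_j - \theta_j| \tag{4.4-33}$$

and

$$E[(c_{j+1} - \theta_{j+1})^2] \le \left(1 - \frac{K_3}{j^{\alpha}}\right) E[(c_j - \theta_j)^2] + \frac{K_4}{j^{2\alpha}} + \frac{K_5}{j^{\alpha}} \left\{ \varepsilon + \varepsilon^{-1} E[(c_j - \theta_j)^2] \right\} \le \left(1 - \frac{K_3}{j^{\alpha}}\right) E[(c_j - \theta_j)^2] + \frac{K_6}{j^{\alpha}} \tag{4.4-34}$$

Chung's lemma gives

$$E[(c_j - \theta_j)^2] = O(j^{0}) = \text{constant} \tag{4.4-35}$$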


c) Conclusion

There are several points worth noting in the above analysis. All the results are intuitively reasonable. The inequality $\omega > \alpha$ indicates that the rate of parameter variation is slower than the rate of convergence in the stationary case (in the order of $j^{-\alpha}$). If the rate of time-variation is relatively slow ($\omega \ge \tfrac{3}{2}\alpha$), the ordinary method of stochastic approximation can be employed to adjust the time-varying parameter without affecting the rate of convergence. On the other hand, if the optimum parameters vary at a rate slower than but comparable to the rate at which $\gamma_j$ decreases ($\alpha < \omega < \tfrac{3}{2}\alpha$), the actual rate of convergence is reduced by an amount depending upon the difference $(\omega - \alpha)$. If the rate of parameter variation is faster than that of convergence, we can never expect to have the algorithm converge to the desired value at any time. This is indicated in Eq. (4.4-32), where $\omega < \alpha$.

In short, with $V_j = E[(c_j - \theta_j)^2]$, the interrelationships between $\omega$, $\alpha$, and the rate of convergence are

    omega < alpha                     V_j = O(j^{2 alpha - 2 omega})   (divergent)
    omega = alpha                     V_j = constant
    alpha < omega < (3/2) alpha       V_j = O(j^{-2 omega + 2 alpha})
    omega >= (3/2) alpha              V_j = O(j^{-alpha})
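The conclusions above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the procedure used in the study: the gradient is taken as $(c_j - \theta_j)$ plus Gaussian noise (so $\mu = 1$), the optimum drifts by exactly $1/j^{\omega}$ per step, and all function names and parameter values are hypothetical.

```python
import random


def predicted_mse_exponent(omega, alpha):
    """Exponent p with E[(c_j - theta_j)^2] = O(j**p), following
    Eqs. (4.4-29), (4.4-32) and (4.4-35)."""
    if omega >= 1.5 * alpha:
        return -alpha
    if omega > alpha:
        return -2.0 * omega + 2.0 * alpha
    # omega == alpha gives 0 (constant); omega < alpha gives a positive exponent.
    return 2.0 * alpha - 2.0 * omega


def tracking_error(omega, alpha, n_steps, noise_std=0.1, seed=0):
    """Run c_{j+1} = c_j - gamma_j * grad_j (Eq. 4.4-13) while the optimum
    moves as theta_{j+1} = theta_j + 1/j**omega; return c - theta at the end."""
    rng = random.Random(seed)
    c, theta = 0.0, 0.0
    for j in range(1, n_steps):
        grad = (c - theta) + rng.gauss(0.0, noise_std)  # assumed noisy gradient, mu = 1
        c -= grad / j**alpha                            # gamma_j = 1 / j**alpha
        theta += 1.0 / j**omega                         # drift of the optimum gain
    return c - theta


if __name__ == "__main__":
    for omega in (2.0, 1.0, 0.75, 0.3):
        print(omega, predicted_mse_exponent(omega, 0.75),
              tracking_error(omega, 0.75, 20000))
```

In this sketch a slowly drifting optimum ($\omega = 2$) is tracked closely, while a fast drift ($\omega = 0.3 < \alpha$) leaves a lag that keeps growing with $j$.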


PERFORMANCE ANALYSIS OF THE ADAPTIVE RECEIVER

5.1 Introduction and Assumptions

In Chapter Two the rather practical situation in which the filters in an array consist of weighted tapped-delay lines was considered. It was shown that tapped-delay-line filters can approximate the continuous Wiener filters quite closely with proper delays and proper weights. The method of stochastic approximation and a mean-square-error criterion were employed in Chapter Three to derive adjustment procedures using signal statistics.

An adaptive array receiver is formed by incorporating these adaptive tapped-delay-line filters in an array as shown in Fig. 3. In this chapter we shall study the performance of such a processor. The performance criteria to be evaluated are the output signal-to-noise ratio and the directivity patterns. These quantities depend on a number of system parameters such as field (target, noise, interference) properties, number of hydrophones in the array, number of taps and their spacings on the tapped-delay lines, adaptation time, locations of the target and interferences, etc.

The following assumptions are used to simplify the analyses:

1) Target, interference and ambient noise are assumed to be Gaussian random processes.

2) The receiving array is assumed to be linear and to consist of K omnidirectional hydrophones.

3) The wavefronts of target signal and interference are regarded as plane over the dimensions of the receiving array.

4) The sum of interference and ambient noise is regarded as the effective noise.

5) The input spectra are identical in shape (but not in level) over the frequency range $(0, \omega_0)$ where most of their power is concentrated. This situation closely resembles conditions encountered in practice if one ignores periodic components of the input processes.

6) The noise consists of a single point interference and ambient noise.

7) The ambient noise is statistically independent from hydrophone to hydrophone.

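Mathematically, the above assumptions are equivalent to the following equations.

(1) Ratios of the input spectra

$$\frac{\Phi_d(\omega)}{\Phi_n(\omega)} = \frac{S}{N}, \qquad \frac{\Phi_n(\omega)}{\Phi_i(\omega)} = \frac{N}{I}, \qquad \frac{\Phi_i(\omega)}{\Phi_d(\omega)} = \frac{I}{S} \tag{5.1-1}$$

for $0 \le \omega \le \omega_0$, where $\omega_0$ is large. The spectra are zero elsewhere.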

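(2) Spectral Matrices

$$\text{Signal:}\quad \underline{\Phi}_{ss} = \Phi_d\, \underline{a}\, \underline{a}^{*T} \qquad\qquad \text{Noise:}\quad \underline{\Phi}_{nn} = \Phi_n \underline{I} + \Phi_i\, \underline{b}\, \underline{b}^{*T} \tag{5.1-2}$$

With the aid of a matrix inversion formula [71],

$$(\underline{E} + \underline{x}\,\underline{x}^{T})^{-1} = \underline{E}^{-1} - \frac{(\underline{E}^{-1}\underline{x})(\underline{x}^{T}\underline{E}^{-1})}{1 + \underline{x}^{T}\underline{E}^{-1}\underline{x}}, \quad \text{if } \underline{E}^{-1} \text{ exists and } \underline{x}\,\underline{x}^{T} \text{ is of rank 1},$$

we have

$$\underline{\Phi}_{nn}^{-1} = \frac{1}{\Phi_n} \left[ \underline{I} - \frac{\underline{b}\, \underline{b}^{*T}}{K + \Phi_n/\Phi_i} \right] \tag{5.1-3}$$

where

$$\underline{a}^{T} = [\, e^{j\omega\tau_1}\ \ e^{j\omega\tau_2}\ \cdots\ e^{j\omega\tau_K} \,], \qquad \underline{b}^{T} = [\, e^{j\omega\rho_1}\ \ e^{j\omega\rho_2}\ \cdots\ e^{j\omega\rho_K} \,] \tag{5.1-4}$$

and, for a linear array of equally spaced hydrophones,

$$\tau_i - \tau_h = (h - i)\,\frac{d}{c} \sin\theta_T, \qquad \rho_i - \rho_h = (h - i)\,\frac{d}{c} \sin\theta_I \tag{5.1-5}$$

together with the following definitions:

d = hydrophone spacing
c = sound velocity in water
$\theta_T$ = target angle
$\theta_I$ = interference angle
$\underline{I}$ = unity matrix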
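The closed-form inverse of Eq. (5.1-3) is easy to check numerically. The sketch below is an illustration only: it assumes the rank-one interference term is the outer product of b with its own conjugate (entries of unit modulus, so that the inner product of b with itself equals K), and it uses plain Python lists rather than a matrix library. All function names and numerical values are hypothetical.

```python
import cmath


def steering_vector(omega, delays):
    """Vector with entries exp(j * omega * delay_i), as in Eq. (5.1-4)."""
    return [cmath.exp(1j * omega * d) for d in delays]


def noise_matrix(phi_n, phi_i, b):
    """phi_n * I + phi_i * b b^H (assumed form of the noise spectral matrix)."""
    K = len(b)
    return [[(phi_n if r == c else 0.0) + phi_i * b[r] * b[c].conjugate()
             for c in range(K)] for r in range(K)]


def noise_matrix_inverse(phi_n, phi_i, b):
    """Closed form (1/phi_n) * [I - b b^H / (K + phi_n/phi_i)] of Eq. (5.1-3)."""
    K = len(b)
    scale = 1.0 / (K + phi_n / phi_i)
    return [[((1.0 if r == c else 0.0) - scale * b[r] * b[c].conjugate()) / phi_n
             for c in range(K)] for r in range(K)]


def max_identity_error(A, B):
    """Largest absolute deviation of the product A B from the identity."""
    K = len(A)
    worst = 0.0
    for r in range(K):
        for c in range(K):
            entry = sum(A[r][m] * B[m][c] for m in range(K))
            worst = max(worst, abs(entry - (1.0 if r == c else 0.0)))
    return worst


if __name__ == "__main__":
    b = steering_vector(2000.0, [i * 3e-4 for i in range(5)])
    print(max_identity_error(noise_matrix(2.0, 5.0, b),
                             noise_matrix_inverse(2.0, 5.0, b)))
```

The inverse never requires a general matrix solve, which is why the optimum filters later reduce to a few delays and constants.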

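5.2 Statistics of an Array Receiver

It will be necessary for all cases to obtain expressions for the mean and variance of the detector output. Some useful expressions for the related spectral densities and spectral matrices will be obtained first. Referring to Figs. 1 to 3, the beamformer output is

$$z(t) = \sum_{i=1}^{K} y_i(t) \tag{5.2-1}$$

and, therefore, its autocorrelation function is

$$R_z(\tau) = E\{z(t)\, z(\tau + t)\} = \sum_{i=1}^{K} \sum_{k=1}^{K} E\{y_i(t)\, y_k(\tau + t)\} = \sum_{i=1}^{K} \sum_{k=1}^{K} R_{y_i y_k}(\tau) \tag{5.2-2}$$

The power spectral density of z is consequently

$$\Phi_z(\omega) = \sum_{i=1}^{K} \sum_{k=1}^{K} \Phi_{y_i y_k}(\omega) = \sum_{i=1}^{K} \sum_{k=1}^{K} H_i(\omega)\, H_k^{*}(\omega)\, \Phi_{x_i x_k}(\omega) \tag{5.2-3}$$

where * indicates the conjugate. If the transfer function vector $\underline{H}(\omega)$ is defined by

$$\underline{H}^{T}(\omega) = [\, H_1(\omega)\ \ H_2(\omega)\ \cdots\ H_K(\omega) \,] \tag{5.2-4}$$

then Eq. (5.2-3) can be written compactly as

$$\Phi_{zz}(\omega) = \underline{H}^{T}(\omega)\, \underline{\Phi}_{xx}(\omega)\, \underline{H}^{*}(\omega) \tag{5.2-5}$$

where $\underline{H}^{T}(\omega)$ is the transpose of $\underline{H}(\omega)$ and

$$\underline{\Phi}_{xx} = \begin{bmatrix} \Phi_{x_1 x_1} & \cdots & \Phi_{x_1 x_K} \\ \vdots & & \vdots \\ \Phi_{x_K x_1} & \cdots & \Phi_{x_K x_K} \end{bmatrix} \tag{5.2-6}$$

is the input spectral matrix.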

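If the reference signal $d(t)$ and noise $n_i(t)$ are assumed to be uncorrelated for all $i$, then

$$R_{x_i x_k}(\tau) = R_{n_i n_k}(\tau) + R_{s_i s_k}(\tau)$$
$$\Phi_{x_i x_k}(\omega) = \Phi_{n_i n_k}(\omega) + \Phi_{s_i s_k}(\omega)$$
$$\underline{\Phi}_{xx} = \underline{\Phi}_{ss} + \underline{\Phi}_{nn} \tag{5.2-7}$$

The $\Phi$'s are understood to be functions of $\omega$. $\underline{\Phi}_{ss}$ and $\underline{\Phi}_{nn}$ are respectively the signal and noise spectral matrices defined by Eqs. (2.1-3). One very useful property of the spectral matrices is that they are Hermitian, i.e.,

$$\underline{\Phi}^{*T} = \underline{\Phi} \tag{5.2-8}$$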

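Assuming without loss of generality (1) that the averaging filter has unity gain at $\omega = 0$, the detector output will have an average

$$\bar{y} = \overline{\int h_{av}(\tau)\, z_2(t - \tau)\, d\tau} = \bar{z}_2\, H_{av}(\omega = 0) = \bar{z}_2 \tag{5.2-9}$$

$$\bar{z}_2 = \overline{z_1^2} = \frac{1}{2\pi} \int_{-\infty}^{\infty} \Phi_{z_1}(\omega)\, d\omega, \qquad \Phi_{z_1}(\omega) = |G(\omega)|^2\, \Phi_z(\omega) \tag{5.2-10}$$

Hence, in the presence of signal, using Eq. (5.2-5),

$$\langle y \rangle_{S+N} = \frac{1}{2\pi} \int_{-\infty}^{\infty} |G|^2\, \underline{H}^{T} \underline{\Phi}_{xx} \underline{H}^{*}\, d\omega \tag{5.2-11}$$

in the absence of signal

$$\langle y \rangle_{N} = \frac{1}{2\pi} \int_{-\infty}^{\infty} |G|^2\, \underline{H}^{T} \underline{\Phi}_{nn} \underline{H}^{*}\, d\omega \tag{5.2-12}$$

and the d.c. change of the output becomes

$$y_{d.c.} = \langle y \rangle_{S+N} - \langle y \rangle_{N} = \frac{1}{2\pi} \int_{-\infty}^{\infty} |G|^2\, \underline{H}^{T} \underline{\Phi}_{ss} \underline{H}^{*}\, d\omega \tag{5.2-13}$$

In order to obtain a convenient expression for the output variance, assume that the averaging time T is long compared to the correlation time of $z_1(t)$, or equivalently, that the bandwidth of the lowpass averaging filter $H_{av}(\omega)$ is much narrower than the bandwidth of $z_1(t)$. Then [23, 24]

$$\sigma_y^2 = \frac{1}{\pi T_{av}} \int [\Phi_{z_1}(\omega)]^2\, d\omega = \frac{1}{\pi T_{av}} \int |G|^4 \left\{ \underline{H}^{T} (\underline{\Phi}_{ss} + \underline{\Phi}_{nn}) \underline{H}^{*} \right\}^2 d\omega \tag{5.2-14}$$

where $T_{av}$ is the averaging time defined by

$$\frac{1}{T_{av}} \triangleq \int h_{av}^2(\tau)\, d\tau \tag{5.2-15}$$

(1) This assumes that the filter does not have any poles at $\omega = 0$; i.e., does not contain an integrator. Thus this assumption is not completely general, but in practice integrators will always be of finite time. So it is not a very serious loss of generality.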


So far we have assumed that the individual filters in an array are continuous filters. If tapped-delay lines are used to replace them, then, referring to Fig. 4, the following expressions are obtained.

Let the $i$th individual filter be

$$H_i(\omega) = \sum_{k=0}^{M} C_{ik}\, e^{-j\omega\Delta_{ik}} \tag{5.2-16}$$

where $C_{ik}$ and $\Delta_{ik}$ are the weight and spacing at the $k$th tap on the $i$th filter.

The weights assume different values depending on the training environment. The post-summation filter $G(\omega)$ is fixed at all times by (2.2-9), i.e.,

$$|G(\omega)|^2 = \Phi_d^{-1}(\omega) \tag{5.2-17}$$

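Substituting Eqs. (5.2-16) and (5.2-17) into Eqs. (5.2-9) through (5.2-14), we obtain the following statistics:

$$\langle y \rangle_{S+N} = \frac{1}{2\pi} \int |G|^2\, \underline{H}^{T} \underline{\Phi}_{xx} \underline{H}^{*}\, d\omega = \sum_{i=1}^{K} \sum_{h=1}^{K} \sum_{k=0}^{M} \sum_{\ell=0}^{M} C_{ik} C_{h\ell}\, \frac{1}{2\pi} \int d\omega\, \Phi_d^{-1} \Phi_{x_i x_h}\, e^{j\omega(\Delta_{h\ell} - \Delta_{ik})} \tag{5.2-18}$$

$$\langle y \rangle_{N} = \frac{1}{2\pi} \int |G|^2\, \underline{H}^{T} \underline{\Phi}_{nn} \underline{H}^{*}\, d\omega = \sum_{i=1}^{K} \sum_{h=1}^{K} \sum_{k=0}^{M} \sum_{\ell=0}^{M} C_{ik} C_{h\ell}\, \frac{1}{2\pi} \int d\omega\, \Phi_d^{-1} \Phi_{n_i n_h}\, e^{j\omega(\Delta_{h\ell} - \Delta_{ik})} \tag{5.2-19}$$

$$y_{d.c.} = \langle y \rangle_{S+N} - \langle y \rangle_{N} = \frac{1}{2\pi} \int |G|^2\, \underline{H}^{T} \underline{\Phi}_{ss} \underline{H}^{*}\, d\omega = \sum_{i=1}^{K} \sum_{h=1}^{K} \sum_{k=0}^{M} \sum_{\ell=0}^{M} C_{ik} C_{h\ell}\, \frac{1}{2\pi} \int d\omega\, e^{j\omega(\tau_i - \tau_h)}\, e^{j\omega(\Delta_{h\ell} - \Delta_{ik})} \tag{5.2-20}$$

$$\sigma_y^2 = \frac{1}{\pi T_{av}} \int |G|^4 \left\{ \underline{H}^{T} \underline{\Phi}_{nn} \underline{H}^{*} \right\}^2 d\omega = \sum_{i,i'=1}^{K} \sum_{h,h'=1}^{K} \sum_{k,k'=0}^{M} \sum_{\ell,\ell'=0}^{M} C_{ik} C_{i'k'} C_{h\ell} C_{h'\ell'}\, \frac{1}{\pi T_{av}} \int d\omega\, \Phi_d^{-2}\, \Phi_{n_i n_h} \Phi_{n_{i'} n_{h'}}\, e^{j\omega(\Delta_{h\ell} + \Delta_{h'\ell'} - \Delta_{ik} - \Delta_{i'k'})} \tag{5.2-21}$$

where $\Phi_{x_i x_h}$ (or $\Phi_{n_i n_h}$) is the $ih$th element of the input (or noise) spectral matrix, and Eq. (5.2-21) is valid for the case of small signal-to-noise ratios.

It is readily seen that if every gain $C_{ik}$ ($i = 1, 2, \ldots, K$ and $k = 0, 1, \ldots, M$) is multiplied by a constant, as in the case of uncertain signal power, the final value of the output signal-to-noise ratio, defined as the change of dc level due to the appearance of a target signal divided by the rms fluctuation of the output, remains essentially unchanged.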

5.3  Initial  Behavior 

Assuming the worst case where absolutely no information about the noise field is known, the gains associated with the input delayed by $\tau_i$ ($i = 1, 2, \ldots, K$) are set to 1 and the rest to zero, so that a square-law detector is used at the starting moments. Here the output of each hydrophone is delayed to provide maximum response in the signal direction, i.e.,

$$\underline{H}^{(I)}(\omega) = \underline{a}^{*} \tag{5.3-1}$$

(1) The integral appearing in Eq. (5.2-20) will, in general, yield delta-functions with infinite strength at certain instants. This difficulty does not arise in practice since most processes are bandlimited and the range of integration is $\omega_1$ to $\omega_2$, where $\omega_1$ and $\omega_2$ are finite numbers.

The  weights  and  spacings  are  simply 


■w 

K  K  K  K  ° 
•III  Z  du 

1=1  h-1  i'=l  h'-l 

J  a 


^2  (K  +  i  |a*T  b|2)2du> 


(5.3-4) 


a)  The  Output  Signal-to-Noise  Ratio 

Dividing Eq. (5.3-3) by the square root of Eq. (5.3-4) gives the output signal-to-noise ratio

(5.3-5)

The term $|\underline{a}^{*T} \underline{b}|^2$ appearing in the numerator of Eq. (5.3-5) is proportional to the side lobe level in the direction of interference. For narrow-band systems, which have pronounced side lobe structure, the signal-to-noise ratio is seen to depend on the side lobe level in the interference direction. This certainly agrees with our expectation.

We shall now evaluate the integral for the case of similar input spectra. Note that

$$\sum_{i=1}^{K} \sum_{h=1}^{K} e^{j\omega(\rho_h - \rho_i)} = \sum_{i=1}^{K} \sum_{h=1}^{K} \cos \omega(\rho_h - \rho_i) = K + 2 \sum_{i=1}^{K-1} \sum_{h=i+1}^{K} \cos \omega(\rho_h - \rho_i) \tag{5.3-6}$$

The value of the double sum in Eq. (5.3-6) can be further evaluated for our case of a linear array with equally spaced hydrophones. If such an array is steered broadside, i.e., if the target is at a location perpendicular to the array axis, then

$$\rho_h - \rho_i = |h - i|\, \frac{d}{c} \sin\theta_I = |h - i|\, \rho_o \tag{5.3-7}$$

and the double sum in Eq. (5.3-6) can be replaced by a single sum

$$\sum_{i=1}^{K} \sum_{h=1}^{K} e^{j\omega(\rho_h - \rho_i)} = K + 2 \sum_{i=1}^{K-1} (K - i) \cos \omega i \rho_o \tag{5.3-8}$$
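The reduction from the double sum of Eq. (5.3-6) to the single sum of Eq. (5.3-8) can be confirmed numerically. The following is a minimal sketch that assumes equally spaced delays, so that the delay difference between hydrophones h and i is a multiple of $\rho_o$; the function names and test values are illustrative only.

```python
import cmath
import math


def double_sum(omega, rho_o, K):
    """Sum over i, h of exp(j * omega * (rho_h - rho_i)) for equally spaced delays."""
    rho = [i * rho_o for i in range(K)]
    return sum(cmath.exp(1j * omega * (rho[h] - rho[i]))
               for i in range(K) for h in range(K))


def single_sum(omega, rho_o, K):
    """K + 2 * sum_{i=1}^{K-1} (K - i) * cos(omega * i * rho_o), as in Eq. (5.3-8)."""
    return K + 2.0 * sum((K - i) * math.cos(omega * i * rho_o)
                         for i in range(1, K))


if __name__ == "__main__":
    print(double_sum(1234.5, 2.5e-4, 7), single_sum(1234.5, 2.5e-4, 7))
```

The single sum works because each separation of m hydrophones occurs for exactly K - m ordered pairs in each direction, and the imaginary parts cancel in conjugate pairs.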


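Using Eq. (5.3-8), we have

$$(\sigma_y^{(I)})^2 = \left(\frac{N}{S}\right)^2 \frac{1}{\pi T_{av}} \int_0^{\omega_0} \left( K + \frac{I}{N}\, |\underline{a}^{*T} \underline{b}|^2 \right)^2 d\omega = \left(\frac{N}{S}\right)^2 \frac{1}{\pi T_{av}} \int_0^{\omega_0} \left\{ K + \frac{I}{N} \left[ K + 2 \sum_{i=1}^{K-1} (K - i) \cos \omega i \rho_o \right] \right\}^2 d\omega$$

$$= \left(\frac{N}{S}\right)^2 \frac{\omega_0}{\pi T_{av}} \left\{ K^2 \left(1 + \frac{I}{N}\right)^2 + 4K\, \frac{I}{N} \left(1 + \frac{I}{N}\right) \sum_{i=1}^{K-1} (K - i)\, \frac{\sin \omega_0 i \rho_o}{\omega_0 i \rho_o} + 2\, \frac{I^2}{N^2} \sum_{i=1}^{K-1} \sum_{h=1}^{K-1} (K - i)(K - h) \left[ \frac{\sin \omega_0 (i - h) \rho_o}{\omega_0 (i - h) \rho_o} + \frac{\sin \omega_0 (i + h) \rho_o}{\omega_0 (i + h) \rho_o} \right] \right\} \tag{5.3-9}$$

where $\sin x / x$ is taken as unity at $x = 0$.

In most practical cases the maximum frequency processed is very high, such that $\omega_0 \rho_o \gg 1$. Then the sums associated with $\rho_o$ make negligible contribution except for $i = h$, and we have a simpler expression

$$(\sigma_y^{(I)})^2 \approx \left(\frac{N}{S}\right)^2 \frac{\omega_0 K^2}{\pi T_{av}} \left\{ 1 + 2\, \frac{I}{N} + \frac{I^2}{N^2} \left[ \frac{2}{3} K + \frac{1}{3K} \right] \right\} \tag{5.3-10}$$

The output signal-to-noise ratio becomes

$$SNR_I = \frac{1}{2} \left( \frac{T_{av}\, \omega_0}{\pi} \right)^{1/2} \frac{S}{N}\, K \left\{ 1 + 2\, \frac{I}{N} + \frac{I^2}{N^2} \left[ \frac{2}{3} K + \frac{1}{3K} \right] \right\}^{-1/2} \tag{5.3-11}$$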
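For most cases of practical interest, the number of hydrophones in an array is large ($K \gg 1$), so that for an ambient-noise-dominated environment

$$SNR_I \propto K \left(\frac{S}{N}\right) \quad \text{when } \left(\frac{N}{I}\right)^2 \gg \frac{2}{3} K \tag{5.3-12}$$

and for an interference-dominated environment

$$SNR_I \propto K^{1/2} \left(\frac{S}{I}\right) \quad \text{when } \left(\frac{N}{I}\right)^2 \ll \frac{2}{3} K \tag{5.3-13}$$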
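The two limiting regimes can be read off numerically from the initial signal-to-noise ratio. The sketch below assumes the form of Eq. (5.3-11) with the bracket $1 + 2(I/N) + (I/N)^2 [2K/3 + 1/(3K)]$ and the prefactor $\tfrac{1}{2}(T_{av}\omega_0/\pi)^{1/2}$; the function and argument names are hypothetical.

```python
import math


def snr_initial(K, s_over_n, i_over_n, t_av, omega_0):
    """Initial (conventional power detector) output SNR in the assumed
    form of Eq. (5.3-11), valid for omega_0 * rho_o >> 1."""
    prefactor = 0.5 * math.sqrt(t_av * omega_0 / math.pi)
    bracket = (1.0 + 2.0 * i_over_n
               + i_over_n ** 2 * (2.0 * K / 3.0 + 1.0 / (3.0 * K)))
    return prefactor * s_over_n * K / math.sqrt(bracket)


if __name__ == "__main__":
    # Hypothetical numbers: 100 hydrophones, 1 s averaging, 5 kHz band edge.
    for i_over_n in (0.0, 0.01, 1.0, 100.0):
        print(i_over_n, snr_initial(100, 0.1, i_over_n, 1.0, 2.0 * math.pi * 5000.0))
```

With no interference the result grows linearly in K; with strong interference it approaches the prefactor times $(S/I)\sqrt{3K/2}$, which is the $K^{1/2}$ behavior of Eq. (5.3-13).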


The results concerning the output signal-to-noise ratio have been previously derived by Schultheiss [32] for a conventional power detector under the assumptions that the interference and ambient noise are white over $0 < \omega < \omega_0$ and $G(\omega) = 1$. In our case $|G(\omega)|^2 = \Phi_d^{-1}(\omega)$ and the input spectra are similar rather than constant over the same frequency range. This is equivalent to inserting an Eckart filter

$$|G_E(\omega)|^2 = \frac{\Phi_d(\omega)}{\Phi_n^2(\omega)} \tag{5.3-14}$$

after the beamforming point in the absence of interference. The effect of inserting this filter has also been considered by Schultheiss and reported elsewhere [39].

b) Directivity Patterns

The average output of the squarer, $\bar{y}$, yields the so-called directivity pattern, which may be obtained by varying the electrical time delays and keeping the physical orientation of the array fixed, or by keeping the electrical time delays fixed and varying the physical orientation of the array relative to the plane wave signal. It is a function of the target bearing relative to the bearing angle of the major lobe of the array pattern.

Let $\theta_T$ and $\theta_I$ be the target and interference bearings relative to the broadside condition, with the convention of signs that angles are measured clockwise from broadside.

[Figure: array geometry showing the target bearing $\theta_T$ and the interference bearing $\theta_I$ relative to the array.]

If we define the following terms for a linear array of equally spaced hydrophones,

$$\tau_o = \frac{d}{c} \sin\theta_T \tag{5.3-15}$$
$$\rho_o = \frac{d}{c} \sin\theta_I \tag{5.3-16}$$
$$\hat{\tau}_o = \frac{d}{c} \sin\theta \tag{5.3-17}$$

d = hydrophone spacing
c = velocity of sound in water

then the signal and interference delays at the $i$th phone in a linear array with equally spaced hydrophones would be

$$\tau_i = (K - i)\,\tau_o \tag{5.3-18}$$
$$\rho_i = (K - i)\,\rho_o \tag{5.3-19}$$

and the steering delay

$$\hat{\tau}_i = (K - i)\,\hat{\tau}_o \tag{5.3-20}$$

where $\hat{\tau}_o$, the looking angle appearing in the individual filters, is the independent variable of the directivity pattern.


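It is seen from Eq. (5.2-18) that the averaged output is

$$\bar{y}_1 = \frac{1}{2\pi} \int d\omega\, |G|^2\, (\underline{H}^{T} \underline{\Phi}_{xx} \underline{H}^{*}) = \sum_{i=1}^{K} \sum_{h=1}^{K} \frac{1}{2\pi} \int d\omega\, \Phi_d^{-1} \left[ \Phi_d\, e^{j\omega(\tau_i - \tau_h)} + \Phi_n \delta_{ih} + \Phi_i\, e^{j\omega(\rho_i - \rho_h)} \right] e^{-j\omega(\hat{\tau}_i - \hat{\tau}_h)}$$

$$= \sum_{i=1}^{K} \sum_{h=1}^{K} \frac{1}{2\pi} \int_0^{\omega_0} d\omega \left[ e^{j\omega(h - i)(\tau_o - \hat{\tau}_o)} + \frac{N}{S}\, \delta_{ih} + \frac{I}{S}\, e^{j\omega(h - i)(\rho_o - \hat{\tau}_o)} \right] \tag{5.3-21}$$

Upon using Eqs. (5.3-6) and (5.3-8) and carrying out the integration, we have

$$\bar{y}_1(\theta) = \frac{\omega_0}{2\pi} \left\{ K \left( 1 + \frac{N}{S} + \frac{I}{S} \right) + 2 \sum_{i=1}^{K-1} (K - i)\, \frac{\sin \omega_0 i (\tau_o - \hat{\tau}_o)}{\omega_0 i (\tau_o - \hat{\tau}_o)} + \frac{I}{S}\, 2 \sum_{i=1}^{K-1} (K - i)\, \frac{\sin \omega_0 i (\rho_o - \hat{\tau}_o)}{\omega_0 i (\rho_o - \hat{\tau}_o)} \right\} \tag{5.3-22}$$

In the above, $\rho_o = \frac{d}{c} \sin\theta_I$, $\tau_o = \frac{d}{c} \sin\theta_T$, $\hat{\tau}_o = \frac{d}{c} \sin\theta$, $\theta$ being the independent variable in calculating the directivity pattern $\bar{y}(\theta)$. If target and interference are well separated in bearing, the directivity pattern will take the general form shown below because of the plus signs appearing in Eq. (5.3-22).

[Figure: sketch of the directivity pattern $\bar{y}(\theta)$ versus $\theta$.]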


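In the signal direction, $\hat{\tau}_o = \tau_o$, and for $\omega_0 \rho_o \gg 1$,

$$\bar{y}_1(\theta = \theta_T) = \frac{\omega_0}{2\pi}\, K \left[ \left( \frac{N}{S} + \frac{I}{S} \right) + K \right] \tag{5.3-23}$$

Similarly, in the interference direction, $\hat{\tau}_o = \rho_o$,

$$\bar{y}_1(\theta = \theta_I) = \frac{\omega_0}{2\pi}\, K \left( 1 + \frac{N}{S} + K\, \frac{I}{S} \right) \tag{5.3-24}$$

and in any other direction

$$\bar{y}_1(\theta) = \frac{\omega_0}{2\pi}\, K \left( 1 + \frac{N}{S} + \frac{I}{S} \right) \tag{5.3-25}$$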
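The directivity pattern and its three special values can be evaluated directly. The sketch below assumes the form of Eq. (5.3-22): a pedestal $K(1 + N/S + I/S)$ plus two $\sin x / x$ sums centred on the target and interference bearings. The function names, argument names and numerical values are illustrative assumptions.

```python
import math


def sinc(x):
    """sin(x)/x with the removable singularity at x = 0 filled in."""
    return 1.0 if abs(x) < 1e-12 else math.sin(x) / x


def directivity(theta, theta_t, theta_i, K, n_over_s, i_over_s, omega_0, d_over_c):
    """Average squarer output versus steering angle theta (radians), in the
    assumed form of Eq. (5.3-22)."""
    tau_o = d_over_c * math.sin(theta_t)    # target delay per element
    rho_o = d_over_c * math.sin(theta_i)    # interference delay per element
    steer = d_over_c * math.sin(theta)      # steering delay per element
    target = 2.0 * sum((K - i) * sinc(omega_0 * i * (tau_o - steer))
                       for i in range(1, K))
    interf = 2.0 * sum((K - i) * sinc(omega_0 * i * (rho_o - steer))
                       for i in range(1, K))
    pedestal = K * (1.0 + n_over_s + i_over_s)
    return omega_0 / (2.0 * math.pi) * (pedestal + target + i_over_s * interf)


if __name__ == "__main__":
    w0 = 2.0 * math.pi * 5000.0
    for deg in (0, 10, 20, 30, 40):
        print(deg, directivity(math.radians(deg), 0.0, math.pi / 6.0,
                               8, 10.0, 10.0, w0, 1e-3))
```

The pattern shows one lobe at the target bearing and a second, interference-driven lobe, both added to the pedestal, as noted in the text.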


5.4  Final  Behavior 


a)  Optimum  Gains  and  Training  Environment 


The  final  form  of  our  adaptive  array  processor  is  the  one  in  which  all  the 
gains  are  set  at  their  optimum  values.  From  the  convergence  properties  of  adap¬ 
tive  tapped- delay-line  filters  we  know  that  the  final  values  of  the  gains  are 
different  under  different  training  environment.  They  are 


(R  ,  +  O'1  R, r 


(5.4-1) 


-  R  ”1  R, r 
—v  -415 


(5,4-2) 


where 


lc10  C11  ~  C1M  C20  C2M  CK0 


— 1 TIM1 


R  ■  E[v_  v_  ] 


[s^Ct)  s^Ct  -  A)  -  sK(t  -  MA) ] 


[n^ft)  n^( t  -  A)  •••  r^Ct  -  MA) ) 


as  indicated  in  Fig.  4, 

It has been shown in Sect. 2.4 that for the tapped-delay-line filters, if the gains are set according to Eq. (5.4-1) or (5.4-2), they are approximately equivalent to the optimum filters

$$\underline{H}^{(S+N)} = (\underline{\Phi}_{ss}^{*} + \underline{\Phi}_{nn}^{*})^{-1}\, \Phi_d\, \underline{a}^{*} \tag{5.4-3}$$

$$\underline{H}^{(N)} = \underline{\Phi}_{nn}^{*-1}\, \Phi_d\, \underline{a}^{*} \tag{5.4-4}$$

The accuracy of the above approximations depends on the number of taps used. For the case of similar input spectra we shall see that filters defined by Eq. (5.4-3) or (5.4-4) can be realized completely by tapped delay lines with proper settings and proper spacings.

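Although the optimum gains are difficult to express analytically using Eq. (5.4-1) or (5.4-2), they can be computed in the frequency domain by

$$C_{ik} = \frac{1}{2\pi} \int_{-\infty}^{\infty} H_i(\omega)\, e^{j\omega\Delta_{ik}}\, d\omega \tag{5.4-5}$$

for the $k$th gain on the $i$th filter. $H_i(\omega)$ is just the $i$th row of either Eq. (5.4-3) or (5.4-4).

We shall first of all consider Eq. (5.4-4). Since

$$\underline{H}^{(N)} = \underline{\Phi}_{nn}^{*-1}\, \Phi_d\, \underline{a}^{*} = \frac{\Phi_d}{\Phi_n} \left[ \underline{I} - \frac{\underline{b}^{*} \underline{b}^{T}}{K + \Phi_n/\Phi_i} \right] \underline{a}^{*} \tag{5.4-6}$$

its $i$th row is

$$H_i^{(N)} = \frac{\Phi_d}{\Phi_n} \left[ a_i^{*} - \frac{b_i^{*} \sum_{k=1}^{K} b_k a_k^{*}}{K + \Phi_n/\Phi_i} \right] \tag{5.4-7}$$

so that the impulse response is

$$h_i(t) = \frac{1}{2\pi} \int \frac{\Phi_d}{\Phi_n} \left( e^{-j\omega\tau_i} - e^{-j\omega\rho_i}\, \frac{\sum_{k=1}^{K} e^{j\omega(\rho_k - \tau_k)}}{K + \Phi_n/\Phi_i} \right) e^{j\omega t}\, d\omega = \frac{S}{N}\, \frac{1}{2\pi} \int d\omega\, e^{j\omega(t - \tau_i)} - \frac{S}{N}\, \frac{1}{K + N/I}\, \frac{1}{2\pi} \int d\omega \sum_{k=1}^{K} e^{j\omega(t - \rho_i + \rho_k - \tau_k)} \tag{5.4-8}$$

and, as is shown in Section 2.5, Eqs. (2.5-12), (2.5-13),

$$C_{ik}^{(N)} = \frac{S}{N} \left( \delta_{ik} - \frac{1}{K + N/I} \right) \tag{5.4-9}$$

at

$$\Delta_{ik} = \rho_i - \rho_k + \tau_k \tag{5.4-10}$$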
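The optimum taps for noise-only training therefore amount to a small table of gains and delays. The following is a minimal sketch assuming the forms $C_{ik} = (S/N)(\delta_{ik} - 1/(K + N/I))$ and $\Delta_{ik} = \rho_i - \rho_k + \tau_k$ of Eqs. (5.4-9) and (5.4-10); the function name, argument names and numbers are hypothetical.

```python
def noise_trained_taps(K, s_over_n, n_over_i, tau, rho):
    """Return (gains, delays) as K-by-K nested lists for the noise-only
    training case, following the assumed forms of Eqs. (5.4-9), (5.4-10)."""
    shrink = 1.0 / (K + n_over_i)
    gains = [[s_over_n * ((1.0 if i == k else 0.0) - shrink)
              for k in range(K)] for i in range(K)]
    delays = [[rho[i] - rho[k] + tau[k] for k in range(K)]
              for i in range(K)]
    return gains, delays


if __name__ == "__main__":
    # Hypothetical 4-element array steered broadside (all tau = 0).
    gains, delays = noise_trained_taps(4, 0.1, 0.5, [0.0] * 4,
                                       [0.0, 1e-4, 2e-4, 3e-4])
    for row_g, row_d in zip(gains, delays):
        print(row_g, row_d)
```

Each filter thus needs only K taps: one at the target delay with a slightly reduced gain, and K - 1 small negative taps that subtract the interference estimate.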


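If the gains are adjusted in the presence of target, we shall use Eq. (5.4-3) instead of Eq. (5.4-4). Since

$$(\underline{\Phi}_{ss}^{*} + \underline{\Phi}_{nn}^{*})^{-1}\, \Phi_d\, \underline{a}^{*} = (\underline{\Phi}_{nn}^{*} + \Phi_d\, \underline{a}^{*} \underline{a}^{T})^{-1}\, \Phi_d\, \underline{a}^{*} = \left[ \underline{\Phi}_{nn}^{*-1} - \frac{\Phi_d\, \underline{\Phi}_{nn}^{*-1} \underline{a}^{*} \underline{a}^{T} \underline{\Phi}_{nn}^{*-1}}{1 + \Phi_d\, \underline{a}^{T} \underline{\Phi}_{nn}^{*-1} \underline{a}^{*}} \right] \Phi_d\, \underline{a}^{*} = \frac{\underline{\Phi}_{nn}^{*-1}\, \Phi_d\, \underline{a}^{*}}{1 + \Phi_d\, \underline{a}^{T} \underline{\Phi}_{nn}^{*-1} \underline{a}^{*}} \tag{5.4-11}$$

and letting

$$1 + \Phi_d\, \underline{a}^{T} \underline{\Phi}_{nn}^{*-1} \underline{a}^{*} = 1 + \frac{\Phi_d}{\Phi_n} \left[ K - \frac{|\underline{a}^{*T} \underline{b}|^2}{K + \Phi_n/\Phi_i} \right] \triangleq \frac{1}{\xi} \tag{5.4-12}$$

we have

$$\underline{H}^{(S+N)} = \xi\, \underline{H}^{(N)} \tag{5.4-13}$$

For the case of similar input spectra, $\xi$ is just a constant

$$\xi = \frac{N^2 + KNI}{N^2 + KNI + KSN + SI\,(K^2 - K_1^2)} \tag{5.4-14}$$

where

$$K_1^2 \triangleq |\underline{a}^{*T} \underline{b}|^2 \le K^2 \tag{5.4-15}$$

with equality when $\underline{a} = \underline{b}$, i.e., when target and interference are in the same direction.

Therefore

$$C_{ik}^{(S+N)} = \xi\, C_{ik}^{(N)} \quad \text{for } i = 1, \ldots, K;\ k = 1, 2, \ldots, K \tag{5.4-16}$$

In other words, for similar input spectra the optimal gains trained under noise alone differ from those under noise plus signal only by a multiplicative constant $\xi$. Thus, all the performance criteria (output signal-to-noise ratio and directivity pattern) remain essentially unchanged regardless of the training environment. Of course, should the input spectra have distinctive shapes, $C_{ik}^{(N)}$ and $C_{ik}^{(S+N)}$ would assume different values. If the signal-to-noise ratio is small at the input to the squarer,

$$\Phi_d\, \underline{a}^{T} \underline{\Phi}_{nn}^{*-1} \underline{a}^{*} \ll 1$$

the constant $\xi$ is close to unity independent of the input spectral shapes.

In view of the above discussions we shall use the optimal gains defined by Eq. (5.4-9) in analyzing the final behavior of our adaptive processor.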
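The multiplicative constant relating the two sets of gains can be checked by evaluating it in two ways. This is a minimal sketch: the name xi is a label chosen here for the constant of Eq. (5.4-14), and the comparison assumes that it equals the reciprocal of $1 + (S/N)[K - K_1^2/(K + N/I)]$, as in Eq. (5.4-12). The numerical values are arbitrary.

```python
def xi_closed_form(S, N, I, K, K1_sq):
    """Closed form of the constant in Eq. (5.4-14); K1_sq stands for |a*T b|^2."""
    return (N**2 + K * N * I) / (N**2 + K * N * I + K * S * N
                                 + S * I * (K**2 - K1_sq))


def xi_from_definition(S, N, I, K, K1_sq):
    """Reciprocal of 1 + (S/N) * [K - K1_sq / (K + N/I)], as in Eq. (5.4-12)."""
    return 1.0 / (1.0 + (S / N) * (K - K1_sq / (K + N / I)))


if __name__ == "__main__":
    print(xi_closed_form(0.3, 2.0, 5.0, 6, 11.0),
          xi_from_definition(0.3, 2.0, 5.0, 6, 11.0))
```

For a weak signal (S much smaller than N) both expressions approach unity, which matches the remark that the training environment then makes little difference.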
b)  The  Output  Signal-to-noise  Ratio 

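From Eqs. (5.4-9), (5.4-10) and (5.2-20), the dc change of the output is

$$y_{d.c.}^{(F)} = \frac{1}{2\pi} \int |G|^2\, (\underline{H}^{T} \underline{\Phi}_{ss} \underline{H}^{*})\, d\omega = \sum_{i=1}^{K} \sum_{h=1}^{K} \sum_{k=1}^{K} \sum_{\ell=1}^{K} \frac{S}{N} \left( \delta_{ik} - \frac{1}{K + N/I} \right) \frac{S}{N} \left( \delta_{h\ell} - \frac{1}{K + N/I} \right) \frac{1}{2\pi} \int d\omega\, e^{j\omega(\tau_i - \tau_h)}\, e^{j\omega(\rho_h - \rho_\ell + \tau_\ell - \rho_i + \rho_k - \tau_k)} \tag{5.4-17}$$

For the sake of simplicity, we shall assume that the array is steered broadside. Then $\tau_i = 0$ ($i = 1, 2, \ldots, K$) and the above expression reduces to

$$y_{d.c.}^{(F)} = \left(\frac{S}{N}\right)^2 \sum_{i=1}^{K} \sum_{h=1}^{K} \sum_{k=1}^{K} \sum_{\ell=1}^{K} \left( \delta_{ik} - \frac{1}{K + N/I} \right) \left( \delta_{h\ell} - \frac{1}{K + N/I} \right) \frac{1}{2\pi} \int_0^{\omega_0} d\omega\, e^{j\omega(\rho_h - \rho_\ell - \rho_i + \rho_k)}$$

$$= \left(\frac{S}{N}\right)^2 \frac{\omega_0}{2\pi} \left\{ K^2 - \frac{2K}{K + N/I} \left[ K + 2 \sum_{i=1}^{K-1} (K - i)\, \frac{\sin \omega_0 i \rho_o}{\omega_0 i \rho_o} \right] + \frac{1}{(K + N/I)^2} \left[ K^2 + 4K \sum_{i=1}^{K-1} (K - i)\, \frac{\sin \omega_0 i \rho_o}{\omega_0 i \rho_o} + 2 \sum_{i=1}^{K-1} \sum_{h=1}^{K-1} (K - i)(K - h) \left( \frac{\sin \omega_0 (i - h) \rho_o}{\omega_0 (i - h) \rho_o} + \frac{\sin \omega_0 (i + h) \rho_o}{\omega_0 (i + h) \rho_o} \right) \right] \right\} \tag{5.4-18}$$

or

$$y_{d.c.}^{(F)} = \frac{\omega_0}{2\pi} \left(\frac{S}{N}\right)^2 \frac{K^2 (K - 1 + N/I)^2}{(K + N/I)^2} \left\{ 1 - \frac{4}{K (K - 1 + N/I)} \sum_{i=1}^{K-1} (K - i)\, \frac{\sin \omega_0 i \rho_o}{\omega_0 i \rho_o} + \frac{2}{K^2 (K - 1 + N/I)^2} \sum_{i=1}^{K-1} \sum_{h=1}^{K-1} (K - i)(K - h) \left[ \frac{\sin \omega_0 (i - h) \rho_o}{\omega_0 (i - h) \rho_o} + \frac{\sin \omega_0 (i + h) \rho_o}{\omega_0 (i + h) \rho_o} \right] \right\} \tag{5.4-19}$$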


Let  us  now  consider  the  variance  of  the  detector  output.  From  Eqs.  (5.4-9), 
(5.4-10)  and  (5.2-2)  we  have 


(.‘•V 


1 

ttT 


r 


av 


|  G|  4  (H^  <f  H*}2  dw 




KKKKKKKK 
1-1  i'»l  h«l  h'«l  k-1  k  -1  H-l  t'-l 


N7  vuik  ”  K+N/I''voi'k'"  K+N/IJVUhJl  ‘  K+N/^  (<Sh '  "  K+N/^ 


*  <£>  -  7ZS7->  <$«  m,  77^7")  («■ 


irT 


av 


.  _2  3“<Pi  -  Ph> 

*d  (*n  6ih  +  *1  e 


•  e 


—  Pk») 

ju)(Ph~  Pt  +  +  Ph,  -  P£t  +  Tfct  -  -  Tfc  -  P1 ,  +  Pfcl  -  Tk,) 


(*n  Si-h’  +  *1  e 


(5.4-20) 


Rearranging, 

,  .  2  K  K  K  K  , 

(V>  ■  iSihSikSnSi  W 


av 


d“  I  I  <6 


d  )  N  ik  K+N/, 


[*n  6ih  +  +1  6 


jw(pi  ~  Ph\  S 


]  N  <6hJt  “  K+N/t) 


-  P*  +  T*  '  Pi  +  pk  "  Tk>  J 


ttT 


av 


d“  1  X  Ji[  I (e 


1 


-jcotf  x  K  J“(-P±  +  pk  ■  V  . 
*  K+N/t  k=l  e 


r  3  t  "j“Th 

[  jj  <e 


i  ^  J«<ph  -  pft  +  v 

I,  e  J 


K-HN/j.  e.*l 


III  .  .  i. 

S5ih+  S  6 


I  ju<pi  “  ph>. 


ttT 


°  S  2 

dw  (I: 


i>  { - 


K+N/j  1  i*le 


K  ~  I 


2  ,  2 


(5.4-21) 




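For $\tau_i = 0$, it reduces to

$$(\sigma_y^{(F)})^2 = \frac{1}{\pi T_{av}} \left(\frac{S}{N}\right)^2 \int_0^{\omega_0} d\omega \left\{ K^2 - \frac{2K}{K + N/I} \sum_{i=1}^{K} \sum_{h=1}^{K} e^{j\omega(\rho_i - \rho_h)} + \frac{1}{(K + N/I)^2} \left| \sum_{i=1}^{K} \sum_{h=1}^{K} e^{j\omega(\rho_i - \rho_h)} \right|^2 \right\}$$

$$= \frac{\omega_0}{\pi T_{av}} \left(\frac{S}{N}\right)^2 \left\{ K^2 - \frac{2K}{K + N/I} \left[ K + 2 \sum_{i=1}^{K-1} (K - i)\, \frac{\sin \omega_0 i \rho_o}{\omega_0 i \rho_o} \right] + \frac{1}{(K + N/I)^2} \left[ K^2 + 4K \sum_{i=1}^{K-1} (K - i)\, \frac{\sin \omega_0 i \rho_o}{\omega_0 i \rho_o} + 2 \sum_{i=1}^{K-1} \sum_{h=1}^{K-1} (K - i)(K - h) \left( \frac{\sin \omega_0 (i - h) \rho_o}{\omega_0 (i - h) \rho_o} + \frac{\sin \omega_0 (i + h) \rho_o}{\omega_0 (i + h) \rho_o} \right) \right] \right\} \tag{5.4-22}$$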


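Dividing Eq. (5.4-19) by the square root of Eq. (5.4-22) yields the final form of the output signal-to-noise ratio

$$SNR_F = \frac{1}{2} \left( \frac{T_{av}\, \omega_0}{\pi} \right)^{1/2} \frac{S}{N}\, \frac{K (K - 1 + N/I)}{K + N/I} \left\{ 1 - \frac{4}{K (K - 1 + N/I)} \sum_{i=1}^{K-1} (K - i)\, \frac{\sin \omega_0 i \rho_o}{\omega_0 i \rho_o} + \frac{2}{K^2 (K - 1 + N/I)^2} \sum_{i=1}^{K-1} \sum_{h=1}^{K-1} (K - i)(K - h) \left[ \frac{\sin \omega_0 (i - h) \rho_o}{\omega_0 (i - h) \rho_o} + \frac{\sin \omega_0 (i + h) \rho_o}{\omega_0 (i + h) \rho_o} \right] \right\}^{1/2} \tag{5.4-23}$$

If $\omega_0 \rho_o \gg 1$, then

$$SNR_F \approx \frac{1}{2} \left( \frac{T_{av}\, \omega_0}{\pi} \right)^{1/2} \frac{S}{N}\, \frac{K - 1 + N/I}{1 + N/(KI)} \left\{ 1 + \frac{(K - 1)(2K - 1)}{3K (K - 1 + N/I)^2} \right\}^{1/2} \approx \frac{1}{2} \left( \frac{T_{av}\, \omega_0}{\pi} \right)^{1/2} \frac{S}{N}\, (K - 1) \tag{5.4-24}$$

where the last approximation holds for strong interference ($N/I \ll 1$) and $K \gg 1$.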
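The asymptotic signal-to-noise ratio can be evaluated from the closed form for well-separated target and interference. The sketch below assumes the first form of Eq. (5.4-24), written with the gain factor $K(K - 1 + N/I)/(K + N/I)$ and the correction $1 + (K-1)(2K-1)/[3K(K - 1 + N/I)^2]$; the function and argument names are hypothetical.

```python
import math


def snr_final(K, s_over_n, n_over_i, t_av, omega_0):
    """Final (fully adapted) output SNR for omega_0 * rho_o >> 1, in the
    assumed form of Eq. (5.4-24)."""
    prefactor = 0.5 * math.sqrt(t_av * omega_0 / math.pi)
    x = n_over_i
    gain = K * (K - 1.0 + x) / (K + x)
    correction = 1.0 + (K - 1.0) * (2.0 * K - 1.0) / (3.0 * K * (K - 1.0 + x) ** 2)
    return prefactor * s_over_n * gain * math.sqrt(correction)


if __name__ == "__main__":
    # Hypothetical numbers: 10 hydrophones, 1 s averaging, omega_0 = 1000 rad/s.
    for n_over_i in (0.01, 0.1, 1.0, 10.0, 1e6):
        print(n_over_i, snr_final(10, 0.1, n_over_i, 1.0, 1000.0))
```

Under strong interference the array gain stays close to K - 1, as if one degree of freedom were spent on nulling the interference, and it returns to K as the interference vanishes.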


B-106 


Eq.  (5.4-23)  or  (5.4-24)  gives  the  asymptotic  performance  of  the  adaptive  array 
processor.  Since  the  training  period  is  definitely  of  finite  time,  the  actual 
signal-to-noise  ratio  is  lower  than  that  given  by  Eq.  (5.4-23)  or  (5.4-24).  It 
is  to  be  noted  that  Eq.  (5.4-23)  is  just  the  output  signal-to-noise  ratio  of  an 
optimal  (likelihood  ratio)  detector  first  investigated  by  Schultheiss  [32]  and 
then  by  l'uteur  [33]  from  a  simpler  formulation  under  the  same  assumption  [similar 
input  spectra  over  (0,  «Q)1*  This  is  no  coincidence  but  stems  from  the  fact  that 
adequately  large  number  of  taps  is  used  (number  of  taps  per  hydrophone  output  * 
number  of  hydrophones  in  an  array)  and  that  for  similar  input  spectra  the  con¬ 
tinuous  individual  filters  are  Just  combinations  of  time  delays  and  constants. 
Therefore,  it  is  reasonable  to  expect  that  in  this  special  case  the  adaptive 
array  processor  can,  in  principle,  converge  to  the  optimal  processor  after  suffi¬ 
ciently  long  period  of  adaptation. 

In  the  absence  of  interference  the  behavior  of  the  processor  remains  un¬ 
altered  throughout  the  training  period  because 


.  T  o)  2  „ 

SNR.  =  SNR  -  (  ---  °)  |  K  for  I  «  0 

1  /  ti  N 


(5.4-25) 


Without interference we do not take advantage of the reasons why the optimum detector is superior to the conventional detector:

1) It can combat noise correlation between hydrophones, and

2) It can utilize variations in input signal-to-noise ratio over the processed frequency band.

Since  from  Eqs.  (5.4-19)  and  (5.4-21)  we  can  write 

!  *TlI2  4 


SNR.  “  I 


_av 

7T 


(V 

v 


i  - 1,  i . 

|3  k! 

k+n/t 


]2  du) 


(5.4-26) 


we see in the above equation that in the absence of the interference, or for very large values of N/I, the side lobe factor $|a^{*T} b|^2 / (K + N/I)$ approaches zero and the output signal-to-noise ratio depends linearly on the number of hydrophones K. For very strong interference (N/I << 1), the same quantity approaches $|a^{*T} b|^2 / K$. The size of this side lobe in a fixed direction depends on frequency. In a narrow band system large variations in this term may occur; that is, there will be distinct maxima and nulls in the side lobe pattern. In a broad band system the magnitude of the side lobe is averaged over frequency.


c)  Directivity  Pattern 

Using  Eq.  (5.2-18),  (5.4-9)  and  (5.4-10), 

r 


y. 


i_ 

2x 


,  ,  ,  "T  ** 

G  2  H  $  H 

I  — oo  —XX  ~ 


du> 


—CD 


we  have 


i-1  h=l 


k£i 


(— ) 

lV 


(5ik  “  K+N/ 


K+N/^ 


1_ 

2tt 


j  _i  3w(t  -  x  ) 

d“  *d  [*d  e  +  *n  6ih 


jco(Pi  -  Ph)  jU(Ph-  P2  +  Tt-  p4  +  Pk  -  Tk) 
♦j  e  J  e 


1_ 

2ir 


j  v  S  r  -j“Ti  e'jU,rl  p  *“(V  V  J“Ti) 
|  1=1  N  K+N/r  k=l  ®  ]  j 


du 


2tt 


.  ,S 

du)(-  e 


“J^i  *  MPk-  V  3“Xi 


K+N /I  k=l 


E.  e 


]e 


(5.4-27) 


H  ”  4 


The  writing  of  the  above  expression  is  permissible  because  for 
*-l 


„  y ,  a  we  have 

-nn  d  - 


|2  HT  t  H 

1  — “XX  — 


4/  a*T  f1"'1  q> ,  ($,  a  a  T  +  JL,  )i_^  a 
vd  —  -nn  d  d -  -nn  — nn  —  d 


*  AT  1  "  *T  - 1 

=  $2  (a  1  <f  a)2  +  a  ±  * 

d  —  -tin  —  d  —  — 


—nn 


(5.4-28) 


B-108 


which is equivalent to the integrand appearing in Eq. (5.4-27). Eq. (5.4-27) is further simplified to


2a 


o  c  2  ,  K  K  ju(h-i)(T  -  t  ) 

✓  ^  \  \  „  r>  OO 


d“  9  ±h  h?! e 


K  ju>(ti-  Tt)  K  K  jW(Th+  Ph-  Pk+  Tk) 


r,  e 


K+N/I  i-1 


z  ,  i.,  e 
h-1  k-1 


1  .  J  ^(Prx)K  jW(VPh>2 

+  - -  [  z  e  e  ] 

(K+N/I)2  11  h  1 


2a 


■*“  <5>  [K  -  kH77  i£i  A  • 


K  K.  Jw(i-h) (t  -  p  ) 


o  o 


1 


K  K  K.  j«(Tt  -  T±-  T  +  V  Pk  +  T  ) 

i=i  h£i  kh  e 


K  K  ju(i-h)(T  -  T  ) 
K  T  v  -  O  O 

i-1  h=l 


K  K  K  j<u(Ti~  T^-  Th  +  Pk~  Pk+  Tk) 

+  i-1  h-1  kh  e 


h  \  k 


and 


(  * 
tiSi 


^(pi“  V  J  V  ph>  ) 

B  h-1 e 


K  K  j(jj(i-h)  (pQ_  tq)  K  K  jw(k-i)  (to~  pq) 
i=l  h-1  e  k*l  «il  e 


(5.4-29) 


(5.4-30) 


B-109 


-  [K  +  2  .E-  (K-i)  cos  u  i(p  -  t  )) 

1*1  o  o 


[K  +  2  E,  (K-h)  cos  id  h(c  —  T  )  ] 
n5*  i  o  o 


■  K  +  2K  (K-i)[co8  u  i(p  -  i  )  +  cos  w  i(t  ~  p  )) 

1“A  O  O  O  O 


+2  E,  (K-l)  cos  w  i(x  -  t  ) 
1=1  o  o 


K-l  K-l 

+  2  Z  Z  (k-l) (K-h)  |  cos  (Ip  -  It  +  hT  -  hp  ) 
l-i  h-l  J  o  o  o  o 

i-h 


+  cos  (Ip  -  It  -  ht 
o  o  o 


+  hpo)] J 


We  have 


r.  K-l  sin  id  i(t  -  t  ) 

~  a  !o  ,S  2  K  +  2  E  (K-l)  - 2 - V-2- 

'  2r  V  11  a  1(t  -  t  ) 

o  o  o 


(5.4-31) 


K-l  sin  id  1(t  -  t  ) 

Sn  tK  +  2  l£l  (K  -  1)  - 77  -°  ■-  °~J 

a  i  (T  -  T  ) 

o  o  o 


.  ,  K-l  sin  u  i(p  -  t  ) 

- 1— y  [K2  +  2K  E  (K  -  i)  ( - 2 - £■  - .  °- 

(K+N/I)  11  (D  i(p  -  x  ) 

O  0  0 


sin  ..  i(x  -  p  )  K-l 
~Z  L(t  ^V)  }  +  2  ill  <K  "«2 

o  o  o 


ain  os  i ( t  -  t  )  \ 

- o - l 

%  1(V  V  ) 


u:  „  .  K-l  sin  w  i(t  -  p  ) 

+  _L  fK(K-l+N/I)  _  2  j;  r  (v-iT  0 _ 2 _ £_  i  C5  a_ 35  ) 

2v  'n'  (K+N/I)  2  K+N/I  i^l  l>  J  ii} 

wo  iUo~  Po> 

In the above, the last terms on the right-hand sides of (5.4-30) and (5.4-31) have been omitted because these terms always contribute very little upon integration. They are divided by $\omega_o$, which is large enough to make $\sin \omega_o \tau / (\omega_o \tau)$ negligible.

At an angle far away from both the signal and interference directions, we may neglect all the oscillatory terms and get


?„<*»  -  £  <f>2  yV-iap2  .  £  (i)2  Ktt-» 

N  (K+N/I)2  2r  N 


1°.  's\  KQC-l+N/I) 

2rr  *TT  (K+N/I) 


(5.4-33) 


To  the  signal  direction,  tq  » 


7.  <e  -  eT)  -  i  (|)2  k2^/i^ 

T  2*  N  (K  +  N/I) 2 


%  <-£, 2  K2QC-1+N/I)2  1  (K-l)  (2K-1) 

2*  V  (K+N/I)2  3  K(K-1+N/I) 


^  ,S,  K(K-1+N/I 

2ir  V  (K+N/I) 


(5.4-34) 


and  in  the  interference  direction,  t  •  p 

o  o 


7.«-v  g  <*>*  g;<K~lw-2J- 

1  2  N  (K+N/I)2 


!<L  (5-)2  K(K-l)  +  ^  K(K--1+N/JX 

2tt  V  K(K  U  2r  (K+N/I) 


'll  rl)  1  (K-D  lo  ,  S,  K(K-l) 

2*  N  (K+N/Kl) 2  "  211  N  K+N/I 


(5.4-35) 


Although exact shapes of the optimal directivity pattern given by Eq. (5.4-32) cannot be plotted without assuming specific input power levels, they will take the general form shown below, as found by comparing Eqs. (5.4-33) through (5.4-35). There is always a maximum in the signal direction and a minimum in the interference direction. Specific results will be given in the next chapter.


B-112 


For the adaptive array processor the tap spacings are fixed throughout the training period but the weights are adjusted according to


Wj+1  “  Sj  ■  2yj  "j  zj  +  2yj  $dC 


(5.5-3) 


where $\gamma_j$ determines the pitch of the algorithm and generally depends on the time index j; the p's are the delayed inputs and z is the summer output. The variation of the weights during the training period will thus determine the adaptive behavior of the processor. The filtering problem was studied in the last two chapters, where the variations of mean squared error as a function of time index j and input statistics were expressed explicitly. Here we shall study the detection problem by examining the variations of the performance criteria during the training period. Since the input is random, the weights expressed by (5.5-3) are random and only their expected values are of significance. It has been shown in Sect. 3.5, especially Eqs. (3.5-7), (3.5-35) and (3.5-43), that the expected values of $W_j$ at any stage are related to their initial and final values by the general formula


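$$E[W_{j+1}] = p_j\,W_1 + q_j\,W_\infty \tag{5.5-4}$$

where $W_1 = W(j=1)$ and $W_\infty = W(j=\infty)$; $p_j$ and $q_j$ are functions of j and depend on the choice of the weighting sequence $\gamma_j$. For example, if $\gamma_j$ is chosen as a weighting matrix such that

$$\gamma_j = \frac{1}{2(j+1)} \begin{bmatrix} 1/\lambda_1 & & 0 \\ & \ddots & \\ 0 & & 1/\lambda_{K(M+1)} \end{bmatrix} \tag{5.5-5}$$

where $\lambda_i$ is the $i$th eigenvalue of R, i = 1, 2, ..., K(M+1), then

$$p_j = \frac{1}{j+1} \tag{5.5-6}$$

$$q_j = 1 - p_j = \frac{j}{j+1} \tag{5.5-7}$$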


If,  however 

Yj  ”  I  •  I  “  unity  matrix 

then  they  are  of  the  form 


r<J«  -  <  ,  r<J+2  -  W 


(5.5-8) 


(5.5-9) 


"  1  '  PJ 


(5.5-10) 


where $\lambda_{min}$ and $\lambda_{max}$ are the minimum and maximum eigenvalues of the input correlation matrix R.

Combining Eqs. (5.5-2) and (5.5-4) we can write


— j+I  “  Pj  + 


*-l 


M 


-  p.  a  +  q,  <J>  <t>.  a 

j  ~  j  ~nn  — 


(5.5-11) 


$H_1(\omega)$ and $H_\infty(\omega)$ denote respectively the initial and final forms of the transfer function vectors.

Since in all cases $p_0 = 1$, $q_0 = 0$ and $p_\infty = 0$, $q_\infty = 1$, the adaptive processor starts as a square law detector and is gradually transformed into an optimal one. We can determine where the adaptive processor stands between these bounds during the adaptation period by substituting Eq. (5.5-11) into the expressions of the various performance criteria.
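As a rough numerical illustration of how $p_j$ and $q_j$ carry the processor from its initial to its final form, the sketch below tracks them along a single eigen-direction of R. It assumes (this is not stated in this section) that the mean weight error obeys $E[W_{j+1}] - W_\infty = (I - 2\gamma_j R)(E[W_j] - W_\infty)$, so that each step multiplies $p_j$ by $1 - 2\gamma_j \lambda$. The function name `mode_coefficients` and the eigenvalue used are hypothetical.

```python
def mode_coefficients(step_sizes, eigenvalue):
    """Return [(p_0, q_0), (p_1, q_1), ...] along one eigen-direction.

    Assumed mean recursion: p_j = p_{j-1} * (1 - 2 * gamma_j * eigenvalue),
    starting from p_0 = 1, with q_j = 1 - p_j.
    """
    p = 1.0
    history = [(p, 1.0 - p)]
    for gamma in step_sizes:
        p *= 1.0 - 2.0 * gamma * eigenvalue
        history.append((p, 1.0 - p))
    return history


lam = 3.0  # hypothetical eigenvalue of R
# Decaying sequence gamma_j = 1 / (2 (j + 1) lam) for j = 1, ..., 9
decaying = [1.0 / (2.0 * (j + 1) * lam) for j in range(1, 10)]
coeffs = mode_coefficients(decaying, lam)
for j, (p, q) in enumerate(coeffs):
    print(f"j={j}  p_j={p:.4f}  q_j={q:.4f}")
```

Under the stated assumption, the decaying sequence gives $p_j = 1/(j+1)$, so the processor is half-way between the two bounds after a single step and approaches the final form only as 1/j thereafter.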

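Basically, we are required to evaluate the following three integrals

$$A_{j+1} = \frac{1}{2\pi} \int_{-\infty}^{\infty} |G|^2\, (p_j H_1 + q_j H_\infty)^T\, \Phi_{ss}\, (p_j H_1 + q_j H_\infty)^*\, d\omega \tag{5.5-12}$$

$$B_{j+1} = \frac{1}{\pi T_{av}} \int_{-\infty}^{\infty} |G|^4 \left[ (p_j H_1 + q_j H_\infty)^T\, \Phi_{nn}\, (p_j H_1 + q_j H_\infty)^* \right]^2 d\omega \tag{5.5-13}$$

$$C_{j+1} = \frac{1}{2\pi} \int_{-\infty}^{\infty} |G|^2\, (p_j H_1 + q_j H_\infty)^T\, \Phi_{xx}\, (p_j H_1 + q_j H_\infty)^*\, d\omega \tag{5.5-14}$$

where in Eq. (5.5-14) the H's contain the steering vector $\hat{a}$ rather than the signal delays $a$, as in deriving the directivity patterns.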


Since

$$(p_j H_1 + q_j H_\infty)^T\, \Phi\, (p_j H_1 + q_j H_\infty)^* = p_j^2\, H_1^T \Phi H_1^* + q_j^2\, H_\infty^T \Phi H_\infty^* + 2\, p_j q_j\, H_1^T \Phi H_\infty^* \tag{5.5-15}$$

we shall evaluate each one of the three integrals, Eqs. (5.5-12) through (5.5-14), by using Eq. (5.5-15). When the first two terms of Eq. (5.5-15) are used, the results are already available from the previous two sections on the initial and final behaviors.
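The three-term expansion of Eq. (5.5-15) can be checked numerically at a single frequency. The sketch below is only an illustration: the vectors, the Hermitian matrix and the names `quad_form`, `direct` and `expanded` are hypothetical. It also makes one assumption explicit: for a Hermitian spectral matrix the two cross products are complex conjugates of each other, so their sum is twice the real part of $H_1^T \Phi H_\infty^*$, which is how the third term is evaluated here.

```python
def quad_form(u, phi, v):
    """Return u^T * phi * conj(v) for vectors u, v and a square matrix phi."""
    n = len(u)
    return sum(u[i] * phi[i][k] * v[k].conjugate()
               for i in range(n) for k in range(n))


def direct(p, q, h1, hinf, phi):
    """Left-hand side: quadratic form of the combined vector p*H1 + q*Hinf."""
    h = [p * a + q * b for a, b in zip(h1, hinf)]
    return quad_form(h, phi, h)


def expanded(p, q, h1, hinf, phi):
    """Right-hand side: initial term + final term + cross term."""
    return (p * p * quad_form(h1, phi, h1)
            + q * q * quad_form(hinf, phi, hinf)
            + 2.0 * p * q * quad_form(h1, phi, hinf).real)


# Hypothetical two-channel example with a Hermitian spectral matrix
phi = [[2.0 + 0j, 1.0 + 1.0j],
       [1.0 - 1.0j, 3.0 + 0j]]
h1 = [1.0 + 0j, 0.0 + 1.0j]
hinf = [0.5 - 0.5j, 2.0 + 0j]
p, q = 0.25, 0.75

lhs = direct(p, q, h1, hinf, phi)
rhs = expanded(p, q, h1, hinf, phi)
print(lhs, rhs)
```

The first two terms depend only on the initial and final transfer functions, which is why only the cross term requires new integrals.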

Thus, we obtain $A_{j+1}^{(k)}$, k = 1, 2, 3, when the kth term on the right-hand side of Eq. (5.5-15) is substituted into Eq. (5.5-12)


.  (1)  2  1.  Irl2  u  T  *  2  -(1) 

Vi  pj  27  1°!  %  ♦  %  -  Pj  ydrC 


*  q,2  7“  lGl2  fli  *  H"  ■  q « *”  y'Tf 

j+1  J  2ir  '  -ss  -®  'd.c 

r  .00 

where  y^  is  given  by  Eq.  (5.3-3)  and  by  Eq .  (5.4-19) 

Q  »  C  Q  *  C 


T  *  2  — («>) 


■ 2  Pj  ’j  H  ’  l®!2  fil  K.  a.  d“ 


1  -1  *T  *T  -1 

2  p.  q.  -r—  d>  a  $ .  a  a  $  a  4> .  dts 
j  j  2tt  d  —  d - -nn  —  d 


(5.5-16) 


(5.5-17) 


B-115 


which  for  broadside  condition  becomes 


K  I  MP,-.,) 

,  K  £  fir  1*1  h-1  _ ,  , 

2  Pj  qj  2n  N  ‘  K  +  N/I  ld“ 


K-l+N/I 

K+N/I 


*  o  .S.  (  K-l+N/I 

2  PJ  qj  2n  Mr  |  K+N/I 

,  K-l  sin  wip  s 

'  iS7i  i*i  TT7  ° 

0  0^ 


sin  (d  i  p. 


If $\omega_o \rho_o \gg 1$, then


(5.5-18) 


(3)  o  ,S.  K-l+N/I 

Aj+1  3  PJ  qj  it  V  k+n/i 

We shall next consider the second integral. Note that

2 

|  <Pj  %  +  i)T  *  (Pj  Hx  +  ilj*  | 

4  ,„T  .  *.2  ■  4  ,„T  .  *.2 

-  pj  (M.l  *  +  q j  QL  i  SJ 

.  .  2  2 ,  T  .  *.2  ,  ,  2  2 .  T  .  *  ,„T  .  *. 

+  U  Pj  Qj  (Si  £  H*)  +4  Pj  qj  (Hl  *  Hj)  (H,  *.  HJ 


(5.5-19) 


+  2  Pj  qj3  '<£  i  Mj  (Hj  £  HJ 

1  T  *  T  *. 

+  2  PjJ  qj  (Hj  $  HJ  (Hj  £i,) 


(5.5-20) 


Thus we obtain $B_{j+1}^{(k)}$, k = 1, 2, ..., 6, by substituting the kth term on the right-hand side of Eq. (5.5-20) into Eq. (5.5-13)


p  H  — -  |g|  u  (HI  *  H  )  du> 

H  IT  11  —1  -nn  -1 


(5.5-21) 


B-116 


(5.5-22) 


„<2>  *  1 

Bj+1  “  qJ  ttT 


av 


IgI**  (HT  ®  H*)2  du 

1  1  — 00  — nn  -~0D 


where  (o^^)  is  given  by  Eq.  (5.3-9)  and  (c^  J)  by  Eq,  (5.4-22), 


,(“)> 


-,(3)  .221 

b3+i  -  4  Pj  q3  iF 


av 


lG|4  du> 


.221 
4PJ  ^  ^ 


av 


—  5  *T  —1  2 

*  (a  $  4>  a  $  .)  dw 

Td  —  — nn  — nn  —  d 


.22 

-  4  p.  q 


K2ui 


J  nT 


av 


B(A>  -  4  n  2  a  2 

BJ+1  PJ  q3  *T 


av 


*  .  T 


|G|4 


(5.5-23) 


,  2  2  1 
4  pj  qj  if 


av 


r\ 

o 

u 


,  2  2  1 
“  4  pj  q3  *T 


av 


>;2  (»*T  n*T  *>*■ 


5.5-24) 


which  becomes  for  broadside  condition 

,2, ...  K-l 


B 


(4) 

j+1 


sin  to  ip 


2  2/2  K^I/N  .  j  (K  ±)  B  ° _ 

4  p3  q3  \  K  +  wTi  tK  +  i-i  (K  i}  «0  i  P0  1 
,,  K-l  sin  1 

+  W'K  +  4K  A  <K-i)  -^TT 


K-l  K-l  sin  a)rt(i-h)unPn  sin  aio(i+h)po  ^ 

+  2  ih  h£i  .o(i-h).o—  + 


(5.5-25) 


B-117 


Furthermore, if $\omega_o \rho_o \gg 1$, then


3+1 


2  2  “o 

1  k2  .  K3I/N 

4  P1  qj  vT  ' 

J  J  av 

[  K  +  "K+N/I 

2  2  Wo 

f  v2  .  I/» 

4  Pj  itT 

J  J  av 

K  +  k+S7i 

K-l  2 


[K 


Bj+1  *  2  q 


3  _J_ 
i  H3  »T 

j  j  av 


-  2  p,  q 


3  i 


j  *T 


av 


|Gl4  (£  *  L)  («X  *Qn  K»>du) 


°(a*T^a)(a*T*nn^^d^ 


2  p,  q 


3  K  S 


1S1  il  N 
J  J  av 


»  I a*Tb | 2 

<K  -  -^7T>dw 


for  t  *  0  ,  the  above  expression  reduces  to 

Ku> 


K-l  sin  wQ  i  P0 


( 5 \  „  3  o  ,S.  -K(K-1+N/I)_  2  j.  (K_t) - £— 

Bj+1  “  2  Pj  Qj  V  [<K+N/I)  1  i»i  %  1  P 

,  3  ,S.  Wo  2  K-l-W/I  „  p  »  i 

=  2  P1  W  “K  K+N/I  10  °P° 

J  J  aV 


B5?1  -  2  pj3  qj  ^ 


av 


Id4  (likaV^Un  &Au> 

X 

2  (a*T  $  a)  (a*T  a  $d)dio 


2  p 


3  _L 

j  itT 

J  a 


— nn 


rw 


n  3  JL. 

p3  qj 


I  (K  +  |  ia*T  b|z)dii) 


Ku)  ..  f  t  K-l.  .Sin^l0o,\ 

-  2  Pj3  Oj  JT-  <!’  \  K  +  5  1K  +  2  1  h^-rT^I 


.5-26) 


(5.5-27) 


(5.5-28) 


B-118 


for  **  0  and 


K2w 


2  P1  qJ  *T  VS'4~  ’  N 
J  J  av 


0  (|)  (1  +  £) 


(5.5-29) 


for $\rho_o \omega_o \gg 1$.


We shall next consider the third integral. Using the kth term on the right-hand side of Eq. (5.5-15) in evaluating Eq. (5.5-14), we have $C_{j+1}^{(k)}$, k = 1, 2, 3


,•  <1)  _  2  _1_ 
cj+l  pj  2ir 


|G|2  lif  «>  H*  du>  -  p,2  y. 

1  '  —1  “XX  “I  rj  JL 


(5.5-30) 


„(2)  .  n  2  JL 

“j+1  qj  2n 


|g|2  *  H*  dw  -  q  2  y 

i  i  —••oo  ”K3D  1  oo 


(5.5-31) 


where  y,  is  given  by  Eq.  (5.3-22)  and  y  by  Eq.  (5.4-32), 
X  00 


(3)  .  1 

cj+l  ~  2  Pj  qj  2 it 


|g|2  H*  4>  dw 

1  1  — 1  -XX  — 00 


^ '  ° 

TT 


"*T  *T  -1  ~ 

a  (6,  a  a  +  $  )  ®  a,  dui 

_  Td -  -Tin  — nn  — 


j  K  +  |  [a*T  a| 2  -  a*T  a  a*T  b  b*T  a]  J 


du 


(5.5-32) 


Since 


...  K  K  3w(i-h)(x  -  i  ) 

I*  al2  ■  th  hJi  • 


(5.5-33) 


and 


"*T  *T  .  *T  ~ 
a  a  a  b  b  a 


K  K  K  MV  V  Th  +  Ph“  Pk~  V 


i=l  h-1  k“l  6 


B-119 


K  K  jo>(i-h)(t  -  t  ) 

K  i-1  hh  e 


K  K  ju(l-k)(p  -  r  ) 

+  .E.  e  00 

i-1  k-1 


K  K  j«(k-h)(p  -  t  ) 

+  K  .  E.  ,E.  e  00 

h-1  k-1 


K  K  K  ja(T  -  t  -  t,  +  p  -  p  -  r  ) 
1  i  n  n  k  k 


2  K  +  .£,  .£.  e 

i-1  h-1  k-1 


1  **  h  **  k  (5.5-34) 

Eq. (5.5-32) reduces to the following expression when the contribution due to the last term on the right-hand side of Eq. (5.5-34) is omitted

.o> 


CjJ 


Cj+1  “  Pj  qj  TI  fK  +  N  K  “  N  K+N/I  *3K  “  2K)-* 


wn  „  K-1 

+  2  PJ  qj  -  I  iil  <K"1) 


sin  u  i(x  -  t  )  „ 

,„.,R _ o  -P-  (i  +  K 


u  i(t  -  t  ) 

O  0  0 


K+N/I 


) 


K  fsln  "o  i0V  To>  Sln  “o  1(Po"  T-) 

k+n/i  1  ,  ^  <*»_  i(p. 


w  i  (p  -  x  ) 

0  0  0 


-  t  ;  \ 

^7’} 


(5.5-35) 


Just as we did for the expressions $\bar{y}_1$ and $\bar{y}_\infty$ in three special cases, we shall also evaluate $C_{j+1}^{(3)}$ in a similar fashion. When the array is steered in a direction far away from that of the target and the interference, we neglect all the oscillatory terms and get


<»> 


Pjqj*  KU  +  N  '  N  K+N/I  ^3K  “ 


In  the  target  direction, 

C  j+1  (S  =  °T)  =  pj  qj  ~  CK  +  N  K  "  N  K+N/I  (3K  '  2K^ 

K-1 


qj  <t>  /  &  <*-*> 


(5.5-36) 


B-120 


-  _  _  “o  .  S  2R-4K+2-HCN/I . 
P]V  K{1  +  N  'k+N/I -  ] 


(5.5-37) 


and  in  the  interference  direction 


CJ+1  “  6I*  P j  3-j  „  +  ij  K  ~  N  K+N/I  *3iC  '  2K)  I 


+  2  Pj  qj  V  tt  K+N/I  ill  (K_i> 


„  -  -2.  RM  +  S  K  ~  3K+2+N/I. 
Pj  qj  *  KI1  +  N  M?i  1 


(5.5-38) 


Now we are ready to express the performance variations during the training period.

(1) Output signal-to-noise ratio

$$\mathrm{SNR}_{j+1} = \frac{\sum_{k=1}^{3} A_{j+1}^{(k)}}{\left( \sum_{k=1}^{6} B_{j+1}^{(k)} \right)^{1/2}} \tag{5.5-39}$$

where the A's are given by Eqs. (5.5-16) through (5.5-19) and the B's by Eqs. (5.5-21) through (5.5-29).

(2) Directivity pattern

$$\bar{y}_{j+1} = \sum_{k=1}^{3} C_{j+1}^{(k)} \tag{5.5-40}$$

where the C's are given by Eqs. (5.5-30), (5.5-31) and (5.5-35). For any particular steering direction $\bar{y}_{j+1}$ is readily computed from Eqs. (5.3-23) through (5.3-25), (5.4-33) through (5.4-35), and (5.5-36) through (5.5-38). Various computations are shown in Chapter Six.
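Once the individual terms are available, combining them according to Eqs. (5.5-39) and (5.5-40) is a matter of two sums. The sketch below shows only that bookkeeping; the function names and the numerical values are hypothetical placeholders, not results from the report.

```python
import math


def snr_next(a_terms, b_terms):
    """Eq. (5.5-39): sum of the three A terms over the square root of the
    sum of the six B terms, all evaluated at the same time index."""
    if len(a_terms) != 3 or len(b_terms) != 6:
        raise ValueError("expected 3 A terms and 6 B terms")
    return sum(a_terms) / math.sqrt(sum(b_terms))


def directivity_next(c_terms):
    """Eq. (5.5-40): sum of the three C terms for one steering direction."""
    if len(c_terms) != 3:
        raise ValueError("expected 3 C terms")
    return sum(c_terms)


# Placeholder values chosen only to make the arithmetic easy to follow
example_snr = snr_next([1.0, 2.0, 3.0], [1.0, 1.0, 1.0, 1.0, 1.0, 4.0])
example_pattern = directivity_next([0.5, 1.5, 1.0])
print(example_snr, example_pattern)
```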


B-121 


CHAPTER SIX


COMPUTER SIMULATIONS AND NUMERICAL EXAMPLES

6.1 Introduction

A great deal of attention has been given to proving that the iterative procedures described in previous chapters converge under certain conditions. Having proved convergence of the adjustment schemes, our problem is to demonstrate that the procedures are feasible; that is, that solutions can be obtained by using the adjustment procedures in a reasonable amount of time. To establish this point, computer studies were made of the design of adaptive tapped-delay-line filters and detectors. Recall that the approach to adaptive receiver design has been an optimal one. The adaptive design is a result of realizing the optimum receiver in a sequential manner.

In this chapter we consider simulating an adaptive processor on a digital computer for a rather specific case to observe how the processor performance varies with time. Several examples have been worked out by digital computer simulation to verify the theoretical analyses presented in Chapter III.

Some computations have also been carried out to show the performance of the adaptive detector described in Chapter V. These computations were based on theoretical analysis rather than simulation which, in this case, would require too much computing time without providing any general conclusions. All these numerical examples were worked out on the IBM 7094 II-7040 direct-coupled system at the Yale Computer Center.

6.2 Computer Simulations

An arbitrary array processor was used here as an example to demonstrate the properties of the two algorithms given in (3.3-6) and (3.3-13), i.e., the algorithms using the desired signal and the signal correlation function, respectively.

A linear array of six uniformly spaced isotropic hydrophones was assumed to be influenced by the following set of signals:

1. A planar target waveform incident from the broadside.

2. A single interfering planar noise waveform incident at angle $\theta_I = 120°$.

3. A "white" gaussian noise source at each hydrophone representing the ambient noise, which is assumed to be uncorrelated from hydrophone to hydrophone.

The hydrophones were spaced $c/\omega_o$ units apart, where c is the velocity of propagation in the isotropic medium and $\omega_o$ is the center frequency of the target signal. The output of each sensor was processed using a tapped-delay

line  containing  ten  multiplying  weights  and  nine  ideal  time  delay  of  (-=i  ) 

w0 

seconds each. Because the target-signal waveform was incident from the broadside direction, the target signal arrived simultaneously (i.e., in phase) at the output of all six hydrophones.

The target signal and the interference were modeled as broadband gaussian random processes. At each hydrophone the ratio of target-signal variance to total noise variance was 0.01. All these properties were generated by passing a pseudo-random gaussian sequence through an appropriately designed digital filter. Signal, noise, and interference were generated as sequences of random numbers from random number generators. The sequences were then transformed from rectangular distribution to normal distribution using existing programs. Each simulation started with an initial weight vector having all components associated with no delay set to unity and the rest to zero. The weight vector was then adapted using the appropriate iterative equations.
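Two of the setup steps above can be sketched in a few lines. The report does not say which program converted the rectangular (uniform) sequences to normal ones; the Box-Muller transform is used here only as one standard way to do it. The names `gaussian_from_uniform` and `initial_weights` and the seed are hypothetical; the array size (six hydrophones) and the ten weights per delay line are taken from the text.

```python
import math
import random

K = 6      # number of hydrophones (from the text)
TAPS = 10  # multiplying weights per tapped delay line (from the text)


def gaussian_from_uniform(rng):
    """Box-Muller transform: two uniform samples -> one N(0, 1) sample."""
    u1 = 1.0 - rng.random()  # lies in (0, 1], which avoids log(0)
    u2 = rng.random()
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)


def initial_weights(k=K, taps=TAPS):
    """weights[i][m] is the weight on tap m of hydrophone i.

    Tap m = 0 is the undelayed tap, set to unity; all others start at zero.
    """
    return [[1.0 if m == 0 else 0.0 for m in range(taps)] for _ in range(k)]


rng = random.Random(1970)  # arbitrary seed, for repeatability only
samples = [gaussian_from_uniform(rng) for _ in range(20000)]
mean = sum(samples) / len(samples)
variance = sum((x - mean) ** 2 for x in samples) / len(samples)
weights = initial_weights()
print(f"mean={mean:.3f}  variance={variance:.3f}")
```

With this starting weight vector the processor simply sums the undelayed hydrophone outputs, which for a broadside target is the conventional beamformer.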

Throughout the study the sequence $\gamma_j$ was determined as

$$\gamma_j = \frac{\gamma_o}{2(j+1)},$$

$\gamma_o$ being a variable parameter. The behavior of the process depends critically on the parameter $\gamma_o$; hence each case considered was carried out for a number of values of $\gamma_o$. As $\gamma_o$ was increased, the rate of convergence of the process increased steadily until a point was reached at which the process would break into violent oscillations that took a long time to die out. This point was predicted in Chapter III, where it was shown that $\gamma_j$ should satisfy

$$0 < \gamma_j < \frac{1}{\lambda_{max}}$$

for all j, especially at the starting moments. Since the largest eigenvalue $\lambda_{max}$ was not known a priori, the best weighting sequence $\gamma_j$ can only be chosen by experiments.
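The onset of oscillation can be illustrated with a one-dimensional sketch. It assumes, as an illustration only, that the mean weight error along the eigen-direction of $\lambda_{max}$ is multiplied by $1 - 2\gamma_j \lambda_{max}$ at each step; the function name `mean_error_history` and all numerical values are hypothetical.

```python
def mean_error_history(gamma0, eigenvalue, steps, decaying=True):
    """Mean weight error along one eigen-direction, starting from 1.0.

    Assumed recursion: e_{j+1} = (1 - 2 * gamma_j * eigenvalue) * e_j with
    gamma_j = gamma0 / (2 (j + 1)) if decaying, else gamma_j = gamma0.
    """
    error = 1.0
    history = [error]
    for j in range(1, steps + 1):
        gamma_j = gamma0 / (2.0 * (j + 1)) if decaying else gamma0
        error *= 1.0 - 2.0 * gamma_j * eigenvalue
        history.append(error)
    return history


lam_max = 4.0  # hypothetical largest eigenvalue; the bound is 1/lam_max = 0.25

small = mean_error_history(0.2, lam_max, 200)   # gamma_j well inside the bound
large = mean_error_history(1.9, lam_max, 200)   # early gamma_j exceed the bound
inside = mean_error_history(0.2, lam_max, 50, decaying=False)
outside = mean_error_history(0.3, lam_max, 50, decaying=False)

print("small gamma_o, peak |error|:", max(abs(e) for e in small))
print("large gamma_o, peak |error|:", max(abs(e) for e in large))
print("constant 0.2, final |error|:", abs(inside[-1]))
print("constant 0.3, final |error|:", abs(outside[-1]))
```

In this sketch the large $\gamma_o$ violates the bound only during the first few steps, which is enough to produce a large sign-alternating excursion before the decaying sequence brings the error back down.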


As a check case, the optimum (mean-square sense) values of the coefficients were found by correlation techniques, by averaging the necessary values of the correlation functions $R_s(\tau)$ and $R_n(\tau)$ over an interval of 2000 samples. This set of coefficients is compared in Table 1 with the sets of coefficients obtained by the approximation method for the two algorithms chosen. The average filtered error as measured by the algorithms over the 2000-sample interval is plotted against time index during the adaptation period. This is shown in Fig. 8, where the minimum mean squared error with the optimum weight is denoted by a horizontal line. The smooth curve indicates the mean squared error calculated by theoretical analysis.

Fig. 9 shows how the weight vector approaches its optimum point independent of the initial settings.
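The correlation-technique check case rests on time-averaged estimates of the correlation functions. A minimal sketch of such an estimate is given below; the function name `sample_correlation` and the test sequence are hypothetical, and the report's own averaging program is not described.

```python
def sample_correlation(x, y, lag):
    """Estimate R_xy(lag) as the average of x[t] * y[t + lag], lag >= 0."""
    n = len(x) - lag
    if lag < 0 or n <= 0:
        raise ValueError("lag must satisfy 0 <= lag < len(x)")
    return sum(x[t] * y[t + lag] for t in range(n)) / n


# Hypothetical 2000-sample sequence that alternates in sign
x = [1.0, -1.0] * 1000
r0 = sample_correlation(x, x, 0)
r1 = sample_correlation(x, x, 1)
r2 = sample_correlation(x, x, 2)
print(r0, r1, r2)
```

Estimates of this kind, taken at the tap delays, would supply the entries of the correlation matrix and vector from which the optimum coefficients are solved.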

Figs. 10a and 10b show that a faster rate of convergence can be obtained by increasing $\gamma_o$, provided that $\gamma_j = \gamma_o / 2(j+1)$ is not so large as to cause oscillation. When a constant weighting sequence was used (Fig. 10a), the mean squared error at later adaptation stages would oscillate around the minimum mse instead of approaching it gradually. For a single filter and known correlation functions, the relationships between the rate of convergence and the maximum eigenvalue are shown in Fig. 10b. As evidenced by Table 1 and Figs. 8 and 10, the filter designs for the two algorithms are equivalent in the sense that they result in nearly equal values of average filtered error. The average filtering errors obtained by these two algorithms over the 2000-sample interval were only a few percent greater than that for the filter designed by correlation techniques. In view of the limitation on the length of data available, this performance is entirely satisfactory.

The important point to be brought out is that the total computing time required by either of the two adjustment procedures for finding the optimum set of coefficients was no greater than the computing time required to measure the necessary correlation functions and solve the associated set of simultaneous equations for the minimum mean-square error coefficients. Thus the adjustment methods are no more trouble to apply than correlation techniques, yet they eliminate the requirement of a priori statistics.

The effect of an uncertain signal is shown in Fig. 11. We see that if the assumed signal power differs from the actual power by a multiplicative constant, the gains adjusted according to algorithm (3.3-16) will converge to their optimum values multiplied by the same constant.

An attempt was made to compare the rate of convergence for two different approaches: the Kalman filtering technique using all the past information (see Section 4.3) versus the ordinary method of stochastic approximation. As expected, the Kalman technique gives a faster rate of convergence. Some of the reasons were given in Section 4.3. See Fig. 12.

There is no general method to select the right number of taps so that a predetermined accuracy can be achieved for any given system. Several runs were made to plot the minimum mse versus the number of taps. It was found that by properly adjusting the tap spacings, five or six taps could produce quite satisfactory results. One plot is shown in Fig. 13.


Table 1. Optimum coefficients found by correlation techniques compared with the coefficients obtained by the two adaptive algorithms. [Tabulated values not recoverable from the available copy.]

B-126 


Variation  of  mean-squared  error. 


MSE  variation  vs.  weighting  constant 


Figure  11.  Effect  of  uncertain  signal  power 


Figure 12. Comparison of rates of convergence
(A) Stochastic approximation
(B) Kalman filtering technique


Figure 14. Variation of detector output

When this simulated array processor was used as a detector, the detector output was examined. After 2000 samples of adaptation, the weight vector was held fixed and the same 2000 samples were sent to the detector. The input contained noise only at the beginning of this operation and contained both signal and noise starting from t = 1000. Fig. 14 shows clearly how to interpret the detector output and decide on the presence of a target.

6.3 Experimental Results

Sonar noise recorded at sea from a collection of 6 hydrophones has been used to test the iterative rules described in previous sections. An IBM 1800 computer was used to make data tapes compatible with an IBM 7094-7040 system, which was used as the principal computational tool in the experiments.

The noise was a 2-second record sampled every 1/8000 second. The total bandwidth of the data was about 425 Hz to 2400 Hz. The hydrophones in a linear array were separated by 7.5 inches.

Since the data were collected in actual sea water, the directionality of the noise field was not known exactly. From the display of correlation functions of several channels (see Fig. 15) there seemed to be some interfering source present in addition to the ambient noise.

In order to show how the processor eliminates background noise, a target signal was produced from a noise generator and passed through a filter. The signal autocorrelation function is shown in Fig. 16. Three different signal directions were tested, i.e., direction A (opposite to the assumed noise direction), direction B (perpendicular to the assumed noise direction), and direction C (similar to the assumed noise direction). After proper signal delays we obtained three different noise correlation matrices whose cross-correlation coefficients for the six channels are tabulated in Table 2.

It is seen from Table 2 that for direction B the actual noise correlation matrix consists of many oscillatory terms. If signal delays were inserted to align a target coming from direction C, the interference was also in phase, so that all the cross-correlation coefficients are positive. On the other hand, if signal delays were inserted to align the target coming from direction A, the noise field was further de-correlated. Consequently, the array gain defined by

$$\text{array gain} = \frac{\text{signal-to-noise ratio of the summer output}}{\text{signal-to-noise ratio at channel 1}}$$

was largest if the target came from direction A and least if both the target and the noise came from the same direction. This is shown in the second row of Table 3. Note that the signal-to-noise ratio in this experiment has been defined as the signal power divided by the noise power, rather than the d.c. change of the output due to the presence of the target divided by the rms fluctuation of the output. The latter definition is more meaningful for the detection problem and the former definition is useful for the problem of signal extraction. Several cases were studied to see how our proposed adaptive array processor eliminated the undesired noise. Complete results are shown in Table 3, where $SNR_{in}$ is the input signal-to-noise ratio at Channel 1, $SNR_c$ is the output signal-to-noise ratio of the conventional processor, $SNR_6$ is the final output signal-to-noise ratio of the adaptive processor after 2000 iterative adjustments and using 6 taps in each individual filter, and $SNR_{12}$ is the same as $SNR_6$ except that twelve taps were used for each individual filter. So far we have assumed that the reference or desired signal d(t) is the same as the target signal, i.e., d(t) = s(t). If d(t) is replaced by some delayed version of s(t), i.e.,

$$d(t) = s(t - \tau)$$

then, for some proper choice of $\tau$, a smaller mean-squared error or larger signal-to-noise ratios may result. Further discussion of this point can be found in [1]. For $\tau \neq 0$, the corresponding $SNR_6$ and $SNR_{12}$ are denoted by $SNR'_6$ and $SNR'_{12}$. The improvement of SNR produced by the adaptive processors over that


B-136 


B-137 


Figure  15.  Noise  correlation  functions 


TABLE 2. Noise cross-correlation coefficients for target from (a) Direction C, (b) Direction B, (c) Direction A.

(a)
Channel      1        2        3        4        5        6
   1       1.00     0.70     0.55     0.56     0.43     0.24
   2       0.70     1.00     0.64     0.58     0.40     0.24
   3       0.55     0.64     1.00     0.68     0.44     0.21
   4       0.56     0.58     0.68     1.00     0.58     0.26
   5       0.43     0.40     0.44     0.58     1.00     0.34
   6       0.24     0.24     0.21     0.26     0.34     1.00

(b)
Channel      1        2        3        4        5        6
   1       1.00     0.42    -0.27    -0.13     0.10     0.089
   2       0.42     1.00     0.36    -0.34     0.05     0.11
   3      -0.27     0.36     1.00     0.22    -0.23     0.10
   4      -0.13    -0.34     0.22     1.00     0.29    -0.13
   5       0.10     0.05    -0.23     0.29     1.00     0.27
   6      -0.089    0.11     0.10    -0.13     0.27     1.00

(c)
Channel      1        2        3        4        5        6
   1       1.00    -0.36     0.21    -0.13     0.12    -0.11
   2      -0.30     1.00    -0.29     0.24    -0.21     0.07
   3       0.21    -0.29     1.00    -0.33     0.15    -0.13
   4      -0.13     0.24    -0.33     1.00    -0.24     0.12
   5       0.12    -0.21     0.15    -0.24     1.00    -0.07
   6      -0.11     0.07    -0.13     0.12    -0.07     1.00

TABLE 3.  Experimental results (SNR_in = .0010 39 for all cases)

Items \ Cases        Direction C    Direction B    Direction A
SNR_c                 0.002431       0.006676       0.012054
SNR_c / SNR_in        2.38           6.55           11.8
SNR_6                 0.003967       0.009556       0.02594
SNR'_6                0.004135       0.010772       0.03179
SNR_12                0.004001       0.009887       0.02774
SNR'_12               0.004187       0.010955       0.03293
SNR_6 / SNR_c         2.13 db        1.56 db        3.18 db
SNR'_6 / SNR_c        2.31 db        2.06 db        4.24 db
SNR_12 / SNR_c        2.18 db        1.73 db        3.63 db
SNR'_12 / SNR_c       2.36 db        2.15 db        4.41 db



by the conventional processor, such as SNR'_6 / SNR_c, is measured in db. The
results shown here are remarkably close to those of optimal filtering using
complete input statistics (performed independently by R. Kneipfer of the U.S.
Underwater Sound Laboratory, New London, Connecticut).
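
The db figures quoted for the improvement are ordinary power ratios on a
logarithmic scale. As a minimal sketch, the conversion can be checked against
the Table 3 entries for the six-tap adaptive processor; only Directions C and B
are used here, and the dictionary and function names are illustrative, not
part of the report.

```python
import math

# Linear SNR values as listed in Table 3 (Directions C and B only).
snr_c = {"C": 0.002431, "B": 0.006676}   # conventional processor
snr_6 = {"C": 0.003967, "B": 0.009556}   # adaptive processor, 6 taps per filter

def improvement_db(snr_adaptive, snr_conventional):
    """Improvement of the adaptive over the conventional processor, in db."""
    return 10.0 * math.log10(snr_adaptive / snr_conventional)

# Improvement for each direction, to be compared with the SNR_6 / SNR_c row.
gain_db = {d: improvement_db(snr_6[d], snr_c[d]) for d in snr_c}
```

The same function applies unchanged to the SNR'_6, SNR_12 and SNR'_12 rows.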

6.4  Numerical  Computations 

To investigate how an adaptive detector changes its performance during the
training period, we could, in principle, simulate such a processor on a digital
computer. However, there are some practical difficulties. Since detection
performances (output SNR, directivity patterns) are functions of the output mean
and variance, at each stage of the training process we are required to calculate
the output mean and variance using a sufficiently large number of samples (say,
1000 or more) for each stage j, where j = 1, 2, ..., number of test samples.
Furthermore, if we want to change any one of the many system parameters, such as
the number of hydrophones, the number of taps, or the input statistics, the
whole process would have to be repeated.

In light of the above difficulties, analytic expressions were derived in
Chapter V to determine how the adaptive detector performs for a specific case
in which the input spectra are identical over a certain frequency range.
Equations (5.5-39) and (5.5-40) are used extensively to carry out numerical
computations.

Fig. 17 shows the variation of output signal-to-noise ratios during the
adaptation period.

If target and interference are well separated in bearing, the (normalized)
directivity patterns are as shown in Fig. 18. In Fig. 18 computations were made
using approximate expressions in the target direction (\theta equal to the
target bearing), in the interference direction (\theta equal to the
interference bearing), and remote from both. Optimal (j = \infty) behaviors of
the array processor as a function of other system parameters have been
considered previously by Schultheiss [32] and are not plotted here.


Experimental results using sonar data have verified that practical adaptive
array processors can perform nearly as well as optimum processors in a
stationary environment. It should be possible to adapt similar iterative
processors to seismic and electromagnetic arrays which operate in a directional
noise environment. It might be possible to minimize reverberation as well as
ambient noise in systems where reverberation is significant.


Figure 17.  Variation of Output SNR



CHAPTER  SEVEN 


SUMMARY,  CONCLUSIONS,  AND  SUGGESTIONS  FOR  FUTURE  RESEARCH 
7.1  Summary  and  Conclusion 

The research described herein has developed a system for processing the
outputs of a passive array of hydrophones. The system consists of an adaptive
linear multichannel filter, together with algorithms for iterative adjustment
of the weights on the tapped-delay lines. It is designed to process the
received wavefront in the presence of ambient noise and interferences. The
system is designed in such a way that it can be readily implemented and can
operate well in real time in the presence of noise fields whose statistics are
unknown a priori.

a)  Assumptions

The development and analysis of the array processor presented in this
research have been based on the following assumptions:

(1)  Target, interferences and ambient noise are Gaussian random
     processes.

(2)  The sum of interferences, ambient noise and local noise is
     regarded as the effective noise, which is assumed to be
     statistically independent of the target signal.

(3)  The target-signal component s_i(t) observed at the output of
     the ith hydrophone is a linear time-invariant transformation
     of d(t), the target-signal component observed at the output
     of an ideal isotropic hydrophone located at the origin of the
     coordinates. The target direction is known, together with the
     autocorrelation function of the target signal (but not
     necessarily its power level).

(4)  The statistics of the noise field are completely unknown.
     Interferences may be present, but this is unknown. If they
     are present, their directions are unknown.

(5)  The wavefronts of target and interferences are regarded as
     plane over the dimensions of the array.

(6)  The processor is a directional array whose gain is maximized
     in the direction from which the target is expected to come.

In Chapter V, in order to analyze the performance of the proposed processor,
it was further assumed that

(7)  The array is linear and consists of equally spaced
     hydrophones.

(8)  The ambient noise is statistically independent from
     hydrophone to hydrophone.

(9)  The input processes are band limited and of similar spectra.

b)  Summary of Results

A mathematical model has been developed to describe the characteristics of
the input processes and the processing mechanisms. This model has been used to
examine the array processor when the filter coefficients are adjusted
iteratively so as to optimize the processor in accordance with the following
performance measures:

(1)  Minimum mean-squared error between the beamformer output and
     the target signal for the filtering problem.

(2)  Maximum signal-to-noise ratio at the processor output for the
     detection problem.

For a general array configuration consisting of an individual filter on each
hydrophone output, a post-summation filter, a square-law device, and an
averaging filter, the optimum individual filters can be constructed from
tapped-delay lines with the weights set to some optimum values. Although these
optimal values cannot be determined without complete knowledge about both the
target and the noise, methods of stochastic approximation can be applied to
adjust the weights iteratively. The only information required in using the
adaptive algorithms is the correlation functions between the wavefront and the
various delayed signals.
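
The iterative weight adjustment can be illustrated with a toy two-weight
example. This is only a sketch under stated assumptions, not the processor of
the report: the signal gains `a`, the unit noise power, the gain sequence
1/(j + 10), and the function name are all hypothetical. The update replaces
the unobservable desired signal by its known correlation with the inputs,
which is the only prior information the text says is required.

```python
import random

def adapt_weights(num_iter=20000, seed=1):
    """Iteratively adjust two weights c so that c.x approximates the desired signal d."""
    rng = random.Random(seed)
    a = (1.0, 0.5)      # hypothetical signal gains at the two taps
    p = a               # correlation E[d x] for a unit-power d (assumed known)
    c = [0.0, 0.0]      # starting weights
    for j in range(num_iter):
        d = rng.gauss(0.0, 1.0)                                  # desired signal sample
        x = [a[i] * d + rng.gauss(0.0, 1.0) for i in range(2)]   # signal plus unit-power noise
        z = c[0] * x[0] + c[1] * x[1]                            # filter output
        step = 1.0 / (j + 10)                                    # decreasing gain sequence
        # The correction uses the known correlation p, never d itself.
        c = [c[i] + step * (p[i] - z * x[i]) for i in range(2)]
    return c

weights = adapt_weights()
# Optimum (Wiener) weights for this toy model: a / (1 + |a|^2).
optimum = [1.0 / 2.25, 0.5 / 2.25]
```

With a gain sequence that decreases as 1/j, the weights settle near the
optimum values without any estimate of the noise statistics being formed
explicitly.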

The proposed algorithms have been shown to converge in mean square and in
probability as long as the second-order statistics of the input processes are
bounded. Explicit expressions for the rate of convergence are derived in terms
of input statistics, various system parameters and training environment. The
mean-squared error is found to decrease approximately in inverse proportion to
the adaptation time. The rate of convergence is essentially indifferent to the
number of weights to be adjusted, as our algorithm allows simultaneous
adjustments. The size of the error, however, depends on the total number of
taps and the starting point. Ranges of the weighting sequence are determined to
maintain stability of the adaptive loop. It is also of interest to note that
there is no signal suppression phenomenon in using our algorithm and that the
final system performance is independent of the signal power level.

Several partially effective techniques have been proposed to adjust
time-varying parameters. It is also found that the ordinary methods of
stochastic approximation can still provide convergent algorithms if the rate of
parameter variation is sufficiently slow. Qualitative discussions are provided.

The performances of the proposed adaptive receiver are evaluated and compared
with those of the non-adaptive systems. The whole system starts as a
conventional detector and is gradually transformed into a space-time filter
optimum in a predetermined direction. This optimum filter is shown to reduce
disturbances coming from other directions. When a signal appears in this
particular direction, a maximum response will be produced. In actual operation,
the average bearing response can be obtained from a plot of the averaged
squared output versus the looking angle of the array. In most practical
situations narrow peaks are considered to be targets.
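
The bearing response described above (averaged squared output versus looking
angle) can be sketched for the simplest case. The example below is an
assumption-laden illustration, not the adaptive processor: it uses a
narrowband, phase-steered conventional beamformer, a six-element line array at
half-wavelength spacing, and a single noise-free plane wave from 20 degrees.
All names and parameter values are hypothetical.

```python
import cmath
import math

def bearing_response(snapshots, num_phones, spacing_wavelengths, look_angles_deg):
    """Averaged squared output of a phase-steered line array for each looking angle."""
    response = []
    for angle in look_angles_deg:
        u = math.sin(math.radians(angle))
        # Steering weights that align a plane wave arriving from `angle`.
        w = [cmath.exp(-2j * math.pi * spacing_wavelengths * k * u)
             for k in range(num_phones)]
        power = sum(abs(sum(w[k] * x[k] for k in range(num_phones))) ** 2
                    for x in snapshots)
        response.append(power / len(snapshots))
    return response

def plane_wave_snapshot(num_phones, spacing_wavelengths, source_deg):
    """Complex hydrophone outputs for a unit-amplitude plane wave from `source_deg`."""
    u = math.sin(math.radians(source_deg))
    return [cmath.exp(2j * math.pi * spacing_wavelengths * k * u)
            for k in range(num_phones)]

angles = list(range(-90, 91, 2))
snaps = [plane_wave_snapshot(6, 0.5, 20.0)]
resp = bearing_response(snaps, 6, 0.5, angles)
peak_angle = angles[resp.index(max(resp))]   # looking angle with the largest response
```

In this idealized case the largest response falls at the looking angle that
coincides with the source bearing, which is the peak an operator would read as
a target.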

7.2  Suggestions  for  Future  Research 

The following problem areas have been suggested by the research reported
here:

a)  Applications in Other Areas

Much work remains to be done in other areas of application. New areas of
application should be explored both from a theoretical standpoint and from a
practical one. Two important areas are seismic arrays and satellite
communications.

In processing seismic data the direction of the source is generally known
because of the impulse nature of the initial signal. The direction and nature
of seismic noise are not easily determined. The iterative procedure suggested
in this research could be used to minimize the effect of such noises.

The system presented here might be used to improve the signal-to-noise ratio
for communication signals received from transmitters located on deep space
probes. Presumably, the direction of the source is known (e.g., the location of
a satellite), but the characteristics of the interfering noises are unknown.
The improvement offered by the array-processing system presented here, as
compared with conventional systems, might be appreciable.

Detailed analysis of the above two areas will be very useful and important
in understanding the performances of these adaptive systems.

b)  Nonstationary Problems

The applicability of adaptive techniques to statistically nonstationary
processes presents some highly challenging mathematical and statistical
problems, and is perhaps the area in which the strongest applications of
adaptive techniques will be made. In this research some procedures have been
proposed, but they are applicable only to special cases. A generalized
formulation to handle this problem would be highly desirable.

c)  Automatic  Recognition  of  Bearing  Response 

In  applying  the  proposed  algorithm  to  actual  sonar  systems,  an  operator 
is  needed  to  interpret  the  bearing  response.  One  would  like  to  ask  whether  or 
not  an  automatic  response  reader  can  be  constructed  by  studying  the  character¬ 
istics  of  directivity  patterns  and  by  developing  some  recognition  algorithms. 


B-148 


APPENDIX  A 


THE OPTIMUM DETECTOR FOR DETECTION OF A GAUSSIAN
SIGNAL IN GAUSSIAN NOISE BACKGROUND

Suppose that the array consists of K hydrophones, and that the received
signal at the ith hydrophone is x_i(t),

    x_i(t) = s_i(t) + n_i(t),    i = 1, 2, ..., K,                      (A-1)

where s_i(t) is the signal that would be observed at the ith hydrophone if
there were no noise, and n_i(t) is the noise, which includes both ambient noise
and interferences. Both s_i(t) and n_i(t) are assumed to be Gaussian random
processes with zero mean, and so is the input x_i(t). If the spectrum of
x_i(t) is limited to frequencies below \omega_0 cps, and the x_i(t) are observed
over an interval T such that \omega_0 T >> 1, then x_i(t) can be expanded in a
Fourier series

    x_i(t) = \sum_{n=-\omega_0 T}^{\omega_0 T} x_i(n) e^{j 2\pi n t / T}        (A-2)

where x_i(n) are complex Fourier coefficients satisfying x_i(-n) = x_i^*(n) and
where the asterisk stands for complex conjugate. It is seen that all the
available information about the signals received by the entire array is
contained in the set of vectors

    X(n) = [ x_1(n), x_2(n), ..., x_K(n) ]^T.                           (A-3)
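
The conjugate symmetry x_i(-n) = x_i^*(n) of the coefficients in (A-2) holds
for any real record, and it is the reason only positive n need be retained. A
small numerical sketch follows; the sampled record, the 64-point grid and the
function name are illustrative assumptions.

```python
import cmath
import math

def fourier_coefficient(samples, n):
    """Coefficient of exp(j 2 pi n t / T), estimated from uniform samples over [0, T)."""
    m = len(samples)
    return sum(s * cmath.exp(-2j * math.pi * n * k / m)
               for k, s in enumerate(samples)) / m

# Hypothetical real record: two sinusoids sampled at 64 points over one interval T.
m = 64
record = [math.cos(2 * math.pi * 3 * k / m) + 0.5 * math.sin(2 * math.pi * 5 * k / m)
          for k in range(m)]

x_pos = fourier_coefficient(record, 3)    # coefficient at n = +3
x_neg = fourier_coefficient(record, -3)   # coefficient at n = -3
x_five = fourier_coefficient(record, 5)   # coefficient at n = +5
```

Here `x_neg` is the complex conjugate of `x_pos`, so the negative-index
coefficients carry no additional information.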

Following [6] and [7], we assume that X(n) and X(m) are statistically
independent for n \neq m. By the same token, we let the signal component of
x_i(t) be given by

    s_i(t) = \sum_{n=-\omega_0 T}^{\omega_0 T} s_i(n) e^{j 2\pi n t / T}        (A-4)

so that the signal at all hydrophones is represented by

    S(n) = [ s_1(n), s_2(n), ..., s_K(n) ]^T.                           (A-5)

Here again we assume that S(n) is independent of S(m) for n \neq m.

The optimum detector is known to be the likelihood ratio detector, which
determines the presence or absence of a target by comparing the likelihood
ratio

    LR = f_S(X) / f_N(X)                                                (A-6)

to a fixed threshold. Here f_S(X) is the conditional probability density
function of the received samples (over all hydrophones and over all
frequencies) when signal is assumed to be present; similarly f_N(X) is the
conditional probability density function when signal is assumed to be absent.
Since X(-n) = X^*(n), and X(n) and X(m) are independent for n \neq m, Eq.
(A-6) can be written as

    LR = \prod_{n=1}^{\omega_0 T} f_S[X(n)] / f_N[X(n)].                (A-7)

Now, define the signal and noise covariance matrices at each frequency by

    P(n) = < S^*(n) S^T(n) >                                            (A-8)

    Q(n) = < X^*(n) X^T(n) >                                            (A-9)


where the superscript T refers to transposition and the symbol < > means
ensemble average (subject to the noise-only hypothesis for Q). Then the
conditional probability density functions appearing in Eq. (A-7) can be
expressed as

    f_N[X(n)] = C_N \exp[ -X^{*T}(n) Q^{-1}(n) X(n) ]                   (A-10)

    f_S[X(n)] = C_S \exp[ -X^{*T}(n) {P(n) + Q(n)}^{-1} X(n) ]          (A-11)

so that the likelihood ratio is

    LR = \prod_{n=1}^{\omega_0 T} [C_S(n) / C_N(n)]
         \exp[ X^{*T}(n) { Q^{-1}(n) - [P(n) + Q(n)]^{-1} } X(n) ]      (A-12)

where the C's are the normalizing constants of the Gaussian distribution. We
further assume that the signal originates from a source sufficiently remote
from the array so that the wavefront is plane as it approaches the receiver.
Referring to Fig. 1, we have

    P(n) = \phi_d(n) a(n) a^{*T}(n)                                     (A-13)

where \phi_d(n) is the signal spectral density at frequency n,
a_i(n) = \exp[ j 2\pi n \tau_i / T ], and \tau_i is the delay at the ith
hydrophone. Since P(n) is now of rank 1, the inversion of the second term in
the brackets can be written [7]

    [P(n) + Q(n)]^{-1} = Q^{-1}(n)
        - \phi_d(n) Q^{-1}(n) a(n) a^{*T}(n) Q^{-1}(n)
          / [ 1 + \phi_d(n) a^{*T}(n) Q^{-1}(n) a(n) ].                 (A-14)

Using Eq. (A-14), one finds that the logarithm of the likelihood ratio is
given by

    \log LR = C + \sum_{n=1}^{\omega_0 T}
        X^{*T}(n) Q^{-1}(n) P(n) Q^{-1}(n) X(n)
        / [ 1 + \phi_d(n) a^{*T}(n) Q^{-1}(n) a(n) ]                    (A-15)

where

    C = \log \prod_{n=1}^{\omega_0 T} C_S(n) / C_N(n).                  (A-16)

Since Q^T(n) = Q^*(n), the quadratic form appearing in Eq. (A-15) can be
written in the form

    X^{*T} Q^{-1} P Q^{-1} X = \phi_d (X^{*T} Q^{-1} a)(a^{*T} Q^{-1} X)
                             = \phi_d (X^T Q^{*-1} a^*)^* (X^T Q^{*-1} a^*)
                             = \phi_d | X^T Q^{*-1} a^* |^2

and therefore

    \log LR = C + \sum_{n=1}^{\omega_0 T} | X^T(n) H_o(n) |^2 |G_L(n)|^2   (A-17)

where

    H_o(n) = Q^{*-1}(n) a^*(n)                                          (A-18)

    |G_L(n)|^2 = \phi_d(n) / [ 1 + \phi_d(n) a^{*T}(n) Q^{-1}(n) a(n) ].   (A-19)

H_o(\omega) and G_L(\omega) are the optimum individual and post-summation
filters, respectively, as referred to in Fig. 3.
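
The step from the exponent of (A-12) to the filter form of (A-17) to (A-19)
rests on the rank-one structure of P(n). The identity can be checked
numerically at a single frequency; the sketch below uses a hypothetical
two-hydrophone example, and the matrix Q, the vector a, the density phi_d and
the data vector X are arbitrary illustrative values.

```python
import cmath

def inv2(m):
    """Inverse of a 2x2 complex matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    """Product of a 2x2 matrix with a two-element vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def herm(u, v):
    """Inner product u^{*T} v of two-element complex vectors."""
    return u[0].conjugate() * v[0] + u[1].conjugate() * v[1]

Q = [[2.0 + 0j, 0.5 + 0.3j], [0.5 - 0.3j, 1.5 + 0j]]   # Hermitian noise matrix (assumed)
a = [1.0 + 0j, cmath.exp(0.7j)]                        # steering vector with |a_i| = 1
phi_d = 0.8                                            # signal spectral density (assumed)
X = [0.3 + 1.1j, -0.7 + 0.2j]                          # received Fourier coefficients

P = [[phi_d * a[i] * a[k].conjugate() for k in range(2)] for i in range(2)]
PQ = [[P[i][k] + Q[i][k] for k in range(2)] for i in range(2)]
Q_inv, PQ_inv = inv2(Q), inv2(PQ)

# Exponent of (A-12): X^{*T} { Q^{-1} - (P + Q)^{-1} } X
exponent = herm(X, matvec(Q_inv, X)) - herm(X, matvec(PQ_inv, X))

# Filter form: |X^T H_o|^2 |G_L|^2 with H_o = Q^{*-1} a^*
Q_conj_inv = inv2([[q.conjugate() for q in row] for row in Q])
H_o = matvec(Q_conj_inv, [z.conjugate() for z in a])
G_L_sq = phi_d / (1.0 + phi_d * herm(a, matvec(Q_inv, a)).real)
filter_form = abs(X[0] * H_o[0] + X[1] * H_o[1]) ** 2 * G_L_sq
```

The two quantities agree to rounding error, which is the statement that the
likelihood-ratio exponent can be realized as a filter, sum, and square
structure.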

If we let \omega_n denote the frequency corresponding to index n, the
summation appearing in Eq. (A-17) can be transformed into an integral for
large T (constant factors omitted):

    \sum_{n=1}^{\omega_0 T} | X^T(\omega_n) H_o(\omega_n) G_L(\omega_n) |^2
        \to \int | X^T(\omega) H_o(\omega) G_L(\omega) |^2 d\omega
        = \int | \sum_{i=1}^{K} G_L(\omega) H_i(\omega) X_i(\omega) |^2 d\omega
        = \int_0^T { \sum_{i=1}^{K} F^{-1}[ G_L(\omega) H_i(\omega) X_i(\omega) ] }^2 dt
                                                                        (A-20)


The last expression is obtained by invoking Parseval's theorem. Eq. (A-20)
and hence Eq. (A-17) can be implemented in a form represented either by Fig.
A-1a or by Fig. A-1b. These two structures are equivalent, but the latter is
drawn here for future comparison. \Phi_{nn}(\omega) is just the noise matrix
Q(\omega) and is used here to make the nomenclature consistent with our
previous developments.

We shall now consider other performance criteria for the array system.
Referring to the general array configuration of Fig. 3, and combining the
post-summation and the individual filters to make G(\omega) = 1, we see from
Sect. 5.2 that the signal-to-noise ratio at the detector output is


SNP. 


1 

2 


I 

IT 


4 

— ss 


H*]dui 


£ 


4 

—XX 


H*  ]2du>] 


1/2 


N 

D 


(A-21) 


Assume that H_o(\omega) maximizes (A-21), and let

    H(\omega) = H_o(\omega) + \epsilon h(\omega)                        (A-22)

where h(\omega) is an arbitrary vector function of \omega. Eq. (A-21) must now
have a maximum at \epsilon = 0 if H_o is in fact optimum. That is,

dSNR 

de 


0 


1 

IT 


h  +  h 


4 

—SB 


I'2  4.  -  ^  4  d“ 


*T  [c2 .  -  <  *■ 


(A-23) 


B-153 


Fig.  A-l  Two  Equivalent  Structures  of  Likelihood  Ratio  Detector. 


where  are  the  optimum  value  of  denominator  and  numerator,  respec¬ 

tively,  of  Eq.  (A-21) 

2  ^ 


is  actually  arbitrary  since  any  constant  multiplier  in  Eq.  (A-21)  cannot  effect 
the  condition  for  a  maximum.  The  two  integrals  of  Eq.  (A- 23)  are  equivalent  by 
virtue  of  the  fact  that  the  spectral  matrix  is  Hermitian 


*T 

4  m  $ 


and  [24] 

/- 

I 

—CO 

Hence 


T  * 

H*  4  H£  du> 


T  * 

H2  4  Hr  do» 


HT[c2  4  -  4  H*  hI  4  lh*du>  -  0 

—  ~s  s  —xx  — M  -xx 


(A-24) 


(A-25) 


(A-26) 


Every component of h is arbitrary, however, and Eq. (A-26) will not be
satisfied for all h^* unless


£  l°2  4s  -  *xx  i  &  5..1  -  0 

Taking  c  ■  1  and  using  Eq.  (2.1-2),  i.e., 


(A-27) 


4  -  4.  a  a 

-38  d - 


Eq.  (A-27)  reduces  to 

“  [  4x  ]_1  ^  * 


-  —  r  *  ,-l  * 

»d  2  [  *x*  1  *4* 


(A-28) 


-1 

* 

*  * 

[ 

$>  ] 

♦  .  a  ■ 

[4  +  <J)j  a 

-xx 

d  — 

— nn  d  — 

*-l 

*  T  *-l 

4,  4 

a  a  4 

f 

A*-1 

d  — nn 

-  -  -nn  . 

L 

<f  - 

-tin 

1  X 

T  *  *  * 

*-l  * 

"  ♦«,  ® 
— nn  — 


^nn  - 
*d 

T  *-l  * 

1  +  *d  5-  inn  i 


(A- 29) 


Thus, the optimum individual filter for our signal model is

    H_o(\omega) = \Phi_{nn}^{*-1}(\omega) a^*(\omega)                   (A-38)

and is invariant under changes of optimization criteria. Only the optimum
post-summation filter G(\omega) needs to be modified according to

G«  •  °L  "  °M*/  <A-3W 


B-157 


Squaring Eq. (B-5),

    (c_{j+1} - c_{op})^T (c_{j+1} - c_{op}) = (c_j - c_{op})^T (c_j - c_{op})
        - 2\gamma_j (c_j - c_{op})^T \nabla Q + \gamma_j^2 \nabla^T Q \nabla Q,

and taking the conditional mathematical expectation for given c_1, c_2, ...,
c_j, we obtain

    E{ ||c_{j+1} - c_{op}||^2 | c_1, ..., c_j }
        = ||c_j - c_{op}||^2 - 2\gamma_j (c_j - c_{op})^T E{\nabla Q}
          + \gamma_j^2 E{\nabla^T Q \nabla Q}.                          (B-6)


From  condition  (C),  Eq.  (B-6)‘  becomes 


l-^j+i  *  ■Sop  1 1 2 1  — i *  *  *  *  % 


2l£r...  £j>  5  I  l-Sj  “  -^opl  I2  -  -£op 


)T  7  Q} 


+  y?  d  (cT  c  +  cT  c.) 
j  -op  “OP  “j  ~j 


Using  condition  (B) ,  Eq.  (B-7)  i6  reduced  to 


e{||£j+1  ^opl  1 2 1  — i » '  ”’■%} 

<  1 1  c_)  *  ^opl  i  2  U  +  Y^  d)  +  2y2.j  d  cT 


-3  'H  -op  k-j 


(B-7) 


+  ,  t.  2&y?  cT  c  ZtJ_,  (1  +  Y2  d) 
k-j  k  —  —op  m-k+1  m 


(B-9) 


Then  Z^+1  ~  |  |Cj+1  -£^11  k£j+1 


H2  .!m  (1  +  ’J  « 


4*  .  I  -  2dy,2  c  c  ft,  (1  +  Y2  d) 
k-i+1  k  —  —op  m“k+l  m 


(B-10) 


B-159 


Taking the conditional mathematical expectation for given c_1, ..., c_j,
we have

CO 

"  E{ilVl  '  •£opl!2|£-l*"'f)}k2j+ia  +  Yk  d> 

+  kSj+1  2dYk  *  -^op  m»k+l  (1  +  Y»  d) 

i  I  "  Sop!  ! 2  +  «*Yj>  +  2Yj  d£T  kEj+l^1  +  Y*  d) 

CO  Q9 

+  . I.  2dY£  cT  C  n.  ...  (1  +  Y2  d) 
k“j+l  k  —  —op  m-k+1  m 


h 


or  l-S-i » ‘  •  •  »£j  ^  £  iLj 

Next,  taking  the  conditional  mathematical  expectation  for  given  Z^t ... ,Z^ 
on  both  sides  of  Eq.  (B-Il) ,  we  have 


(B-U) 


}  £2, 


(B-lla) 


Since  2^  -  f  (cx,  £2 . c^)  . 

Inequality  (B-lla)  shows  that  Z,  is  a  semimartingale,  where 

—J 

EZj+iiEZj  <  ...  iEZ^- 


(B-12) 


so that, according to the theory of semimartingales, the sequence Z_j
converges with probability one, and hence by virtue of Eqs. (B-1b) and (B-1c)
the sequence (c_j - c_{op}) also converges with probability one to some random
number \xi. It remains to show that P(\xi = 0) = 1. It is seen from Eqs.
(B-12), (B-9) and (B-1c) that the sequence E{ ||c_j - c_{op}||^2 } is bounded.
Now taking the mathematical expectation on both sides of the inequality (B-7)


Doob,  J.  L.,  Stochastic  Processes,  John  Wiley  and  Sons,  N.  Y.,1953 


B-1S0 


-  i0!,llJ)..i«llsJ  -Sopll2!  -  *1}  -iop)T  7  1’ 

+  Y’j  1[£opT  io,  *  '<%’  %>> 

and  adding  the  first  j  inequalities  together,  we  have  by  deduction 

B{l  i^j+1  -  £opl  12}  i  E{li%  -  £opl  i2  +  Jl  <d£opT  iopYk  +  dYk  B<%  £k)} 


-  kSi  2\  E{<%  -  V  7  « 


(B-13) 


Since E{ ||c_j - c_{op}||^2 } is bounded and condition (B-1c) is fulfilled, from
Eq. (B-13) it follows that


kSi  Yk  E{(CJ  -  cop)  7  Q}  * 


(B-14) 


Using  condition  (B-lb) ,  i.e.,  Yj  "  “  and  notl-n8  (B-2) 


inf 

E  ^  li£-^opl 


E{(c  -  c  )  7  Q>  >  0 

i  ~  -op 

<  — 
e 


We  deduce  from  Eq,  (B-14)  that 
T 

E{ (c,  -  c  )  V  Q>  +  0  with  probability  one  for  some  sequence  N  .  (b-15) 
— j  -op 

Now,  taking  E{||c^  -  c^ |  | 2 )  -*•  with  probability  one,  and  comparing 
Eq,  (B-15)  with  Eq.  (B-2)  we  obtain 


    \xi = 0 with probability one.                                       (B-16)

Therefore, algorithm (B-4) converges with probability one,

    P{ \lim_{j \to \infty} (c_j - c_{op}) = 0 } = 1,                    (B-17)

as well as in the mean-square sense, i.e.,

    \lim_{j \to \infty} E{ ||c_j - c_{op}||^2 } = 0.                    (B-18)


B-161 


APPENDIX  C 

Some  properties  of  Gamma  functions 

Since

    \Gamma(a + n) = (a + n - 1) \Gamma(a + n - 1)
                  = (a + n - 1)(a + n - 2) \Gamma(a + n - 2)
                  = (a + n - 1)(a + n - 2) ... a \Gamma(a),

we have

    \prod_{k=1}^{n} (a + k - 1) = a(a + 1) ... (a + n - 1)
                                = \Gamma(a + n) / \Gamma(a).            (C-1)

Thus

    \prod_{k=1}^{j} (1 - x/(k+1)) = \Gamma(j + 2 - x) / [ (j+1)! \Gamma(2 - x) ].   (C-2)

Eq. (C-2) can be approximated by using the formula*

    \Gamma(x) = e^{-x} x^{x - 1/2} (2\pi)^{1/2}
                [ 1 + 1/(12x) + 1/(288x^2) - 139/(51840x^3)
                  - 571/(2488320x^4) + O(x^{-5}) ]
              \approx e^{-x} x^{x - 1/2} (2\pi)^{1/2}    for x >> 1.    (C-3)

From Eq. (C-3) we can write, for j >> 1,

    \Gamma(j + 2 - a) \approx e^{-(j+2-a)} (j+2-a)^{j + 2 - a - 1/2} (2\pi)^{1/2}
                      = e^{-(j+2-a)} (j+2-a)^{j + 3/2} (j+2-a)^{-a} (2\pi)^{1/2}   (C-4)

    (j+1)! = \Gamma(j + 2) \approx e^{-(j+2)} (j+2)^{j + 3/2} (2\pi)^{1/2}.        (C-5)

Since

    j + 2 - a \approx j + 2    if j >> a,

* Whittaker and Watson, Modern Analysis, p. 253



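we obtain from Eqs. (C-4) and (C-5)

    \Gamma(j + 2 - a) / (j+1)! = \Gamma(j + 2 - a) / \Gamma(j + 2)
        \approx (j + 2 - a)^{-a} \approx (j + 1)^{-a}
        if j >> 1 and j >> a.                                           (C-6)

Therefore, combining Eqs. (C-2) and (C-6) gives

    \prod_{k=1}^{j} (1 - x/(k+1)) \approx 1 / [ \Gamma(2 - x) (j+1)^x ]   (C-7)

and furthermore,

    \prod_{j=m}^{n} (1 - x/(j+1)) \approx ( m / (n+1) )^x.              (C-8)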

B-163 


APPENDIX  D 


EFFECT OF UNCERTAIN SIGNAL POWER ON THE FINAL
VALUES OF THE GAINS


To illustrate the essential steps involved in studying the convergence
properties of algorithm (3.5-1), we shall consider only the single-gain case.
The corresponding extension to the multiple-gain case is straightforward but
laborious due to matrix manipulations. Examples have been presented in Sect.
3.5.

Let the assumed signal correlation function \hat{R}_s and the actual signal
correlation function R_s be related by a multiplicative constant G_s,

    \hat{R}_s = G_s R_s.                                                (D-1)

The single-gain version of (3.5-1) is

    c_{j+1} = c_j + 2\gamma_j ( \hat{R}_s - z_j x_j )
            = c_j + 2\gamma_j G_s R_s - 2\gamma_j c_j x_j^2             (D-2)

since z(t) = c x(t) in this simple case. The optimum gain is known to be

    \theta = c_{op} = (R_s + R_n)^{-1} R_s = R_s / \overline{x^2}.      (D-3)

The average of Eq. (D-2) is then

    \bar{c}_{j+1} = (1 - 2\gamma_j \overline{x^2}) \bar{c}_j
                    + 2\gamma_j G_s \overline{x^2} \theta
                  = \bar{c}_1 \prod_{k=1}^{j} (1 - 2\gamma_k \overline{x^2})
                    + G_s \theta \sum_{k=1}^{j} 2\gamma_k \overline{x^2}
                      \prod_{l=k+1}^{j} (1 - 2\gamma_l \overline{x^2}).   (D-4)

If we set

    \gamma_j = 1 / [ 2 (j+1) \overline{x^2} ],                          (D-5)

Eq. (D-4) becomes

    \bar{c}_{j+1} = \bar{c}_1 / (j+1) + G_s \theta j / (j+1)            (D-6)

which shows that the mean of c_j converges to G_s \theta at the rate of j^{-1}:

    ( \bar{c}_{j+1} - G_s \theta ) = ( \bar{c}_1 - G_s \theta ) / (j+1)
        \to 0  as  j \to \infty.                                        (D-7)
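
The convergence of the mean gain can be checked directly. The sketch below
iterates the averaged recursion (D-4) with the gain sequence (D-5), compares
it with the closed form (D-6), and also runs one noisy realization of (D-2).
The Gaussian input, the parameter values and the function names are
assumptions made for illustration.

```python
import random

def mean_gain(j_max, c1, gs, theta):
    """Iterate the averaged recursion (D-4) using the gain sequence (D-5)."""
    c = c1
    for j in range(1, j_max + 1):
        two_gamma_x2 = 1.0 / (j + 1)      # 2 * gamma_j * (mean-square input)
        c = (1.0 - two_gamma_x2) * c + two_gamma_x2 * gs * theta
    return c                               # mean of c_{j_max + 1}

def closed_form(j, c1, gs, theta):
    """Closed form (D-6) for the mean of c_{j+1}."""
    return c1 / (j + 1) + gs * theta * j / (j + 1)

def simulate_gain(j_max, c1, gs, rs, rn, seed=7):
    """One noisy run of the single-gain algorithm (D-2) with Gaussian input."""
    rng = random.Random(seed)
    x2_mean = rs + rn                      # mean-square value of the input
    c = c1
    for j in range(1, j_max + 1):
        x = rng.gauss(0.0, x2_mean ** 0.5)
        gamma = 1.0 / (2.0 * (j + 1) * x2_mean)
        c = c + 2.0 * gamma * (gs * rs - c * x * x)
    return c

recursion = mean_gain(50, 2.0, 1.5, 0.5)
formula = closed_form(50, 2.0, 1.5, 0.5)
final_gain = simulate_gain(20000, 0.0, 1.5, 1.0, 1.0)   # theta = 0.5, so G_s * theta = 0.75
```

With G_s = 1.5 the gain settles near 1.5 times the optimum value, which
illustrates how an error in the assumed signal power scales the final gain.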


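We shall now consider \overline{c^2}. Squaring (D-2),

    c_{j+1}^2 = (1 - 2\gamma_j x_j^2)^2 c_j^2 + 4\gamma_j^2 G_s^2 R_s^2
                + 4\gamma_j (1 - 2\gamma_j x_j^2) c_j G_s R_s,

and taking the average yields

    \overline{c_{j+1}^2} = (1 - 4\gamma_j \overline{x^2} + 4\gamma_j^2 \overline{x^4}) \overline{c_j^2}
        + 4\gamma_j^2 G_s^2 R_s^2
        + 4\gamma_j (1 - 2\gamma_j \overline{x^2}) G_s R_s \bar{c}_j.   (D-8)

If we let

    4\gamma_j \overline{x^2} - 4\gamma_j^2 \overline{x^4} = v_j         (D-9)

    4\gamma_j^2 G_s^2 R_s^2 + 4\gamma_j (1 - 2\gamma_j \overline{x^2}) G_s R_s \bar{c}_j = u_j   (D-10)

Eq. (D-8) reduces to a simpler form

    \overline{c_{j+1}^2} = (1 - v_j) \overline{c_j^2} + u_j
        = \overline{c_1^2} \prod_{k=1}^{j} (1 - v_k)
          + \sum_{k=1}^{j} u_k \prod_{l=k+1}^{j} (1 - v_l).             (D-11)

From Eqs. (D-5) and (D-9) we can write

    (1 - v_k) = 1 - (4\gamma_k \overline{x^2} - 4\gamma_k^2 \overline{x^4})
              = (1 - A_1/(k+1)) (1 - A_2/(k+1))                         (D-12)

where

    A_1 = 1 - \sqrt{1 - a}                                              (D-13)

    A_2 = 1 + \sqrt{1 - a}                                              (D-14)

    a = \overline{x^4} / (\overline{x^2})^2.                            (D-15)

Two approximations for the Gamma function will be used:

    \prod_{k=1}^{j} (1 - A/(k+1)) = \Gamma(j + 2 - A) / [ (j+1)! \Gamma(2 - A) ]
        \approx 1 / [ \Gamma(2 - A) (j+1)^A ]                           (D-16)

    \prod_{k=l+1}^{n} (1 - A_1/(k+1)) (1 - A_2/(k+1)) \approx (l+1)^2 / (n+1)^2.   (D-17)

Using Eqs. (D-12) to (D-17), Eq. (D-11) becomes

    \overline{c_{j+1}^2} = \overline{c_1^2} / [ \Gamma(2 - A_1) \Gamma(2 - A_2) (j+1)^2 ]
        + \sum_{k=1}^{j} u_k (k+1)^2 / (j+1)^2.                         (D-18)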

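But from Eqs. (D-10) and (D-5),

    u_k = 4\gamma_k^2 G_s^2 R_s^2 + 4\gamma_k (1 - 2\gamma_k \overline{x^2}) G_s R_s \bar{c}_k
        = [ 1/(k+1)^2 ] ( G_s R_s / \overline{x^2} )^2
          + [ 2k/(k+1)^2 ] ( G_s R_s / \overline{x^2} ) \bar{c}_k.      (D-19)

Substituting Eqs. (D-3) and (D-6) into Eq. (D-19) (taking \bar{c}_k in the
form of (D-6) with j = k) gives

    u_k = [ 1/(k+1)^2 ] [ G_s^2 \theta^2 + 2 G_s^2 \theta^2 k^2/(k+1)
                          + 2 G_s \theta \bar{c}_1 k/(k+1) ]            (D-20)

and Eq. (D-18) becomes

    \overline{c_{j+1}^2} = \overline{c_1^2} / [ \Gamma(2 - A_1) \Gamma(2 - A_2) (j+1)^2 ]
        + [ j/(j+1)^2 ] G_s^2 \theta^2
        + [ 2 G_s^2 \theta^2 / (j+1)^2 ] \sum_{k=1}^{j} k^2/(k+1)
        + [ 2 G_s \theta \bar{c}_1 / (j+1)^2 ] \sum_{k=1}^{j} k/(k+1).  (D-21)

For large j, we may write

    \sum_{k=1}^{j} k^2/(k+1) \approx \sum_{k=1}^{j} k = j(j+1)/2        (D-22)

    \sum_{k=1}^{j} k/(k+1) \approx \sum_{k=1}^{j} 1 = j.                (D-23)

Eq. (D-21) then becomes

    \overline{c_{j+1}^2} = \overline{c_1^2} / [ \Gamma(2 - A_1) \Gamma(2 - A_2) (j+1)^2 ]
        + [ j/(j+1)^2 ] G_s^2 \theta^2 + [ j/(j+1) ] G_s^2 \theta^2
        + 2 G_s \theta \bar{c}_1 j/(j+1)^2.                             (D-24)

The error variance in the parameter space is

    \overline{(c_{j+1} - \theta)^2} = \overline{c_{j+1}^2} - 2\theta \bar{c}_{j+1} + \theta^2   (D-25)

which by utilizing Eqs. (D-24) and (D-6) reduces to

    \overline{(c_{j+1} - \theta)^2}
        = \overline{c_1^2} / [ \Gamma(2 - A_1) \Gamma(2 - A_2) (j+1)^2 ]
        + [ \theta^2 / (j+1)^2 ] [ j^2 (G_s - 1)^2 + j (2 G_s^2 - 2 G_s + 2) + 1 ]
        + [ 2 \bar{c}_1 \theta / (j+1)^2 ] [ j (G_s - 1) - 1 ].         (D-26)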


B-167 


Af 


APPENDIX  E 


GENERAL DYNAMIC METHODS OF STOCHASTIC APPROXIMATION

Recently Dupac [36] has proposed a dynamic stochastic approximation
method for the particular case where the optimum of a production process
moves linearly during the optimization period. This method essentially
consists of a two-step approximation procedure to be performed at each
stage of the optimization process. The first step is designed to correct
for the time-varying trend of the parameters being estimated; the second step
is made by means of an ordinary stochastic approximation procedure, based
on the observation of a new sample. In [36], the parameters were assumed
to vary linearly, and the convergence conditions remain essentially unchanged
from those of the stationary case.

Here we generalize the above method to include any nonlinear and coupled
variations. Convergence conditions are modified accordingly and the
proposed method is shown to reduce to Dupac's scheme as a special case.

a)  Dupac's Method

Consider the problem of finding the extrema of functions of several
variables

    I = Q(x | c)                                                        (E-1)

where x = {x_1, x_2, ..., x_n} is a vector of random processes with
distribution P(x) and c = {c_1, c_2, ..., c_n} is a vector of parameters to be
adjusted. When P(x) is unknown, an algorithm derived from the method of
stochastic approximation to obtain the set c = \theta is

    c_{j+1} = c_j - \gamma_j \nabla_c Q(x_j | c_j)                      (E-2)

whose properties have been derived in Section 3.2 and Appendix B. It is
assumed that \theta is time-invariant, i.e., \theta_j = \theta for all j.

When the random environment is non-stationary, the optimum set \theta
becomes a function of time index j. Its value at time j will be denoted
by \theta_j. Dupac [36] has considered the case where \theta_j is linearly (in
his sense) time-varying,

    \theta_{j+1} = \theta_j (1 + 1/j) + o(1/j^\omega)                   (E-3)

where \omega > \alpha, \alpha being related to \gamma_j by
\gamma_j = O(1/j^\alpha).

Dupac's method is to estimate the unknown parameters c by

    c_{j+1} = c_j (1 + 1/j) - \gamma_j \nabla Q.                        (E-4)

Algorithm (E-4) can be shown [36] to converge with probability one,

    P{ \lim_{j \to \infty} (c_j - \theta_j) = 0 } = 1,                  (E-5)

as well as in mean square,

    \lim_{j \to \infty} E{ ||c_j - \theta_j||^2 } = 0,                  (E-6)

under the following conditions:

(A)  1/2 < \alpha < 1,  \gamma > 0.                                     (E-7)

(B)  There exist constants K_l and K_u, 0 < K_l < K_u < \infty, such that

     K_l ||c - \theta||^2 \le (c - \theta)^T E{\nabla Q} \le K_u ||c - \theta||^2.   (E-8)

(C)  For all values of c,

     Var[ \nabla^T Q \nabla Q ] \le \sigma^2 < \infty                   (E-9)

where \nabla Q \triangleq \nabla_c Q(x | c) for simplicity.


Condition  (B)  and  (C)  are  equivalent  to  conditions  (3.2-14)  and  (3,2-15). 
Condition (B) simply says that $\nabla Q$ must lie between two planes, one of
positive slope and the other of finite-positive slope. This condition in the
one-dimensional case is illustrated below.

[Figure: one-dimensional illustration of condition (B); the curve $\nabla Q$
lies between a line of positive slope and a line of finite-positive slope.]


Condition (C) says that the variance of $\{\nabla^T Q \, \nabla Q\}$ is finite,
while Eq. (3.2-15) says that its expected value rather than its variance is
finite. The conditions imposed here on the behavior of $\nabla Q$ are somewhat
stronger than those for the ordinary methods of stochastic approximation.


b)  The  General  Method 

In  this  section  we  shall  relax  the  restriction  that  the  parameters  vary 
linearly  during  the  adaptation  period.  It  is  assumed  that  the  law  governing 
the  variation  is  known,  although  the  sequence  to  be  estimated  is  unknown. 

Theorem 3: Let the variation of $\theta_j$ be governed by a known operator
$L_j$ such that

$$ \theta_{j+1} = L_j(\theta_j) \tag{E-10} $$

Define the following quantities

$$ G_j = G(\theta_j) = (\mathrm{grad}\, L_j)^T \tag{E-11} $$

$$ \lambda_j = \sup_{\text{all } \theta} \{ \text{eigenvalues of } G_j \} \tag{E-12} $$

$$ \bar{Q}_j = E\{ Q(x_j \mid c_j) \} \tag{E-13} $$

EtV^Qj  |  c^,  ^2»**r*  -%-l^  "  Ej 

(E-14) 

B-171 


(E-1S) 


tl,'\  '=Q1!%’  %-l>  i  ®J  +  dj  “2 


V  -  -X- 

j  Ja 


(E-16) 


3  j6 


(E-17) 


(E-18) 


Then the algorithm


(E-19) 


converges  in  the  sense  of  Eqs.  (E-5)  to  (E-6)  under  the  following  conditions 


A.  -2“6<a<_l  +  B 

0  <  6 


0  <  5  <  - 


(E-20) 


B. There exist constants $K_l$ and $K_u$, $0 < K_l < K_u < \infty$, such that

$$ K_l \| c_j - \theta_j \|^2 \le (c_j - \theta_j)^T \nabla_c \bar{Q}_j \le K_u \| c_j - \theta_j \|^2 \tag{E-21} $$

for $j = 1, 2, \ldots$

C. For all values of $c$,

$$ \mathrm{Var}[\nabla_c^T Q \, \nabla_c Q] \le \sigma^2 < \infty \tag{E-22} $$


Note that conditions (E-21) and (E-22) are just conditions (E-8) and (E-9),
but Eq. (E-7) is replaced by Eq. (E-20) to take account of the time-varying
effect.


Proof: 


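From Eqs. (E-10) and (E-19) the estimation error equation can be written
in the form

$$ e_{j+1} = c_{j+1} - \theta_{j+1} = L_j(c_j) - L_j(\theta_j) - \gamma_j \nabla_c Q_j = G_j e_j - \gamma_j \nabla_c Q_j \tag{E-23} $$

where $G_j$, defined by Eq. (E-11), is a nonsingular matrix.

Take the inner product of Eq. (E-23) with itself:

$$ e_{j+1}^T e_{j+1} = e_j^T G_j^T G_j e_j - 2 \gamma_j e_j^T G_j^T \nabla_c Q_j + \gamma_j^2 \nabla_c^T Q_j \nabla_c Q_j \tag{E-24} $$

Note that $e_j$ is a function of $x_j$, $j = 1, 2, \ldots$, and hence a random variable.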

Taking  the  conditional  mathematical  expectations  of  Eq.  (E-24)  and  using  the 
definitions  Eqs.  (E-12)  to  (E-15) ,  we  obtain 

E{~j+1  ^J+ll-1’  —2* '  *  *  ’^j-l^ 

*  Xj  E{^j  —2’  *  *  *’  ^j-l} 


-  .  T  ,  2  r  T  .  .2  2  * 

■  2yj  'i  %  +  i  Is)  =i  +  dj  °  1 

From  Eq.  (E-21)  we  can  write 


(E-25) 


«Sj  1 j  ^K^IIejH2 


Sj  2j  $  Kul  l*jl  I2 


(E-26) 

(E-27) 


and  thus  Eq.  (E-25)  becomes 

V  <  X2  V  -  2y  X  K  V  +  y2  K  V  +  y2  d2 

J+l  -  J  j  Yj  J  Vj  +  YJ  V j  Yj 


*5  a  ‘  2vj  kj  + 1  Vvj  +  7d  dj  °z 


(E-28) 


where 

Vj  -  E{e^  e^lc^t  c2»***» 

kj  ■  vj1 
KJ  -  vl2 

Since the sequence $\gamma_j$ is monotonically decreasing, there is a $J$ such that


(E-29) 

(E-30) 

(E-31) 


(1  -  2yjkj  +  yjKj)  <  [l-(2-<  ^y  j  ] 


(E-32) 


B-173 


and 


[1  -  (2  -  ej)kjTjl  >  o 

0  <  Ej  <  2  for  j  >,  J 


Thus,  for  j  >  J  ,  inequality  Eq.  (E-28)  may  be  weakened  to 


Vi 1  xj[1’(2  -  Wj1  + 1  dj 32 


(2-33) 

(E-34) 

(E-35) 


The term in the bracket is positive for $j > J$, and we may start with $j = J$
and use this recursive inequality repeatedly to obtain


s  Vj-i,  j-i  xlf°2  A  i d!  A  »i 


in  which 

B 


i, j  “  2-i+l  (1  "  yl  »  0  i  1  S  J 


(E-36) 


(E-37) 


and 


-  0 


l)  -  k](2  -  r  )  >  0 


.  i  >  3 


(E-38) 


By  taking  the  logarithm  of  both  sides  of  Eq,  (E-37)  and  using  the  inequality 


108  (1  -  W  1  -Vi 

one  may  show  that 

J 

B  .  <  exp  i  -  .7  y  E,  t 

i,J  ~  ‘  i  <«i+l  i  £  , 

Therefore, it is necessary to have


V  v  r 

jSl  J  j 


(E-39) 


(E-40) 


(E-41) 


and 


j£i  XS <  50 


to make the first term on the right-hand side of Eq. (E-36) vanish


BJ-l,j-l  =  ° 


(E-42] 


(E-43! 


B-J74 


Now,  let  us  consider  the  remaining  term  on  the  right-hand  side  of  Eq. (E-36) . 
Let  u(x)  denote  the  unit  step  function 


u(x) 


1  x  >,  0 

0  x  <  0 


Then  the  limit  of  the  term  in  question  can  be  written  as 
j-1  j-1 

£  °  t£j  i  “5  “£,3-!  JU+i 

-  T?  di  Yj-i  11  *  ■<*-*»  jsL  *1 

■  ”2  «Ij  yl  o»  b£,j-i  [l  -  “«-«)  !>m 


(E-44) 


(E-45) 


The interchange of limit and sum is justified provided that the sum is
absolutely convergent, i.e.,

*Sj  yl  *1  <  •  f£-46> 

Combining Eqs. (E-36), (E-43), and (E-45) at last yields the desired result

$$ \lim_{j \to \infty} V_j = 0 \tag{E-47} $$

if Eqs. (E-41), (E-42), and (E-46) are satisfied.

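Since, by the integral test, the series $\sum_{j=1}^{\infty} j^{-\alpha}$
diverges for $\alpha \le 1$ and converges absolutely for $\alpha > 1$, for the
ordinary stochastic approximation we require

$$ \tfrac{1}{2} < \alpha \le 1 $$

so that $\sum_{j} \gamma_j = \infty$ and $\sum_{j} \gamma_j^2 < \infty$ if
$\gamma_j = \gamma / j^{\alpha}$.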
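The two series conditions above can be checked numerically. The following is a minimal sketch, assuming the step-size form $\gamma_j = \gamma / j^{\alpha}$ with $\gamma = 1$; the value $\alpha = 0.75$ and the helper name `partial_sum` are illustrative choices, not taken from the report.

```python
def partial_sum(exponent, n_terms):
    """Return sum_{j=1}^{n_terms} j**(-exponent)."""
    return sum(j ** (-exponent) for j in range(1, n_terms + 1))


alpha = 0.75  # illustrative value inside 1/2 < alpha <= 1

for n in (10**2, 10**3, 10**4, 10**5):
    s1 = partial_sum(alpha, n)      # partial sum of gamma_j: keeps growing
    s2 = partial_sum(2 * alpha, n)  # partial sum of gamma_j**2: levels off
    print(f"n={n:>6}  sum gamma_j={s1:8.3f}  sum gamma_j^2={s2:6.4f}")
```

The first column grows without bound as more terms are added, while the second settles toward a finite limit, which is the behavior the requirement $\tfrac{1}{2} < \alpha \le 1$ is meant to guarantee.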

In  the  general  dynamic  case,  we  require  that 


ih  tj  -  -  .  jli  y]  dj  <  -  *  “d  jli  < 


B-175 


As $k_j$ is proportional to $\lambda_j^{-1}$ by Eqs. (E-30) and (E-38), we can let

$$ \lambda_j = O(j^{\beta}) \tag{E-48} $$

and

$$ d_j = O(j^{-\delta}) \tag{E-49} $$

Eqs. (E-41), (E-42), and (E-46) are satisfied if Eq. (E-20) is satisfied. The
constraints given by Eq. (E-20) indicate that the sequence
$\gamma_j = \gamma / j^{\alpha}$ cannot be chosen arbitrarily. If the optimum
set varies too fast ($\delta < 1/2$), the proposed algorithm will fail to
track it. Actually, since the ordinary stochastic approximation method
converges at the rate of $O(j^{-\alpha})$, the tracking operation will
definitely  fail  whenever  S  >  a  or  6  >  a  . 

In  [36],  it  is  assumed  that 


Vl 


W  +  t> 


Then 


-  (grad  1^ 


xj=  i  +  ro(_) 


(1  +  j) 


(1+J) 


(E-50) 


(1  +  y> 


(E-51) 

(E-52) 

(E-53) 


so that $\tfrac{1}{2} < \alpha \le 1$ remains unchanged as in the stationary case.

It is also to be noted that although $\gamma_j \to 0$ as $j \to \infty$, so
that the adjustments become smaller as the adaptation process proceeds, the
parameters to be estimated still vary in a way similar to the variation of the
optimum set because $c_{j+1} = L_j(c_j)$ for $j \to \infty$.
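The tracking behavior described in this appendix can be illustrated with a small simulation. This is only a sketch under stated assumptions: a scalar parameter, the linear law $L_j(\theta) = (1 + 1/j)\theta$ of the Dupac special case, a quadratic criterion $Q = \tfrac{1}{2}(x - c)^2$, and the update read as $c_{j+1} = L_j(c_j) - \gamma_j \nabla_c Q$. The function name `track` and all numerical values are hypothetical.

```python
import random


def track(n_steps=20000, alpha=0.75, gamma=1.0, noise_std=1.0, seed=0):
    """Track a drifting optimum with a dynamic stochastic approximation step."""
    rng = random.Random(seed)
    theta = 1.0  # true optimum theta_1 (hypothetical starting value)
    c = 5.0      # initial estimate c_1, deliberately wrong
    for j in range(1, n_steps + 1):
        x = theta + rng.gauss(0.0, noise_std)  # noisy observation around theta_j
        grad = c - x                           # gradient of 0.5*(x - c)**2 w.r.t. c
        gamma_j = gamma / j ** alpha           # decreasing step size
        growth = 1.0 + 1.0 / j                 # known law L_j(.) = (1 + 1/j) * (.)
        c = growth * c - gamma_j * grad        # move with the known law, then correct
        theta = growth * theta                 # the optimum drifts by the same law
    return c, theta


c_final, theta_final = track()
print(f"estimate={c_final:.3f}  optimum={theta_final:.3f}")
```

Although the optimum grows roughly linearly with $j$, the estimate stays close to it because the known law is applied to the estimate before each (shrinking) gradient correction.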


B-176 


APPENDIX  F 


SUMMARY  OF  THE  KALMAN  FILTERING  TECHNIQUES 

In this appendix the formulas for the discrete-time optimum-filter
solution are given. Detailed derivations can readily be found in the
literature.

Define the following nomenclature:

% 

:  system  state  vector  or  system  parameters 

h 

:  input  or  control  function 

h 

:  white  noise 

:  system  output  vector 

A 

:  system  dynamics  function 

G 

:  input  constraints  on  system  state 

H. 

:  constraints  on  observing  the  state  of  the 

system  from  the  system  output 

The linear system is characterized by the difference equations

Vl  " 

(F-l) 

(F-2) 

together  with  the  statistics: 


£{u,} 

~J 


E{v.}  ■  0  for  all  j 


E{ 


T, 
u. } 


'Sk  “kj 

E{uj  v£>  *  0 


The  optimum  filter  minimizing  the  performance  criterion 

<H.  £  -  DJT  rT1  (H.  £  -  DJ 
— j  3  *rJ  ~  j 


iMj  £ 


v;-i 

Rj 


(F-3) 

(F-4) 

(F-5) 

(F-6) 

(F-7) 


B-177 


is  described  by 


9, ,,  6 .  +  K  (d...  -  h...  A  EO 
rJ+1  — j  -j  j+1  -j+1 - j 


(F-8) 


where $d_{j+1}$ = the $(j+1)$-st column of $D_{j+1}$,

$h_{j+1}$ = the $(j+1)$-st column of $H_{j+1}$,

and $K_j$ is the weighting matrix defined by

T  -1 
K .  -  P..  ,  h*  R , , , 

— J  — a^-t  +j+l  "4+1 

-l 


(^j  +  ^T) 


(F-9) 


+  tJ+1  -j+1  -j+1 


(F-10) 


-j+1 


is  the  outer  product  of  error  of  the  optimal  estimate 

^j+1  =  E  (%+l  ~  -j+P  (-j+l  “  l 


(F-ll) 
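For orientation, the predict/update structure of a discrete-time Kalman filter can be sketched in scalar form. This is the generic textbook recursion, not a transcription of Eqs. (F-8) to (F-11); the function name `kalman_step` and the parameters `a`, `h`, `q`, `r` (state transition, observation gain, process-noise variance, measurement-noise variance) are assumptions introduced for the illustration.

```python
def kalman_step(x_est, p_est, z, a=1.0, h=1.0, q=0.0, r=1.0):
    """One predict/update cycle of a scalar Kalman filter."""
    # Predict: propagate the estimate and its error variance through the dynamics.
    x_pred = a * x_est
    p_pred = a * p_est * a + q
    # Update: weight the innovation by the gain k.
    k = p_pred * h / (h * p_pred * h + r)
    x_new = x_pred + k * (z - h * x_pred)
    p_new = (1.0 - k * h) * p_pred
    return x_new, p_new


# Estimate a constant from nine identical measurements (hypothetical data).
x, p = 0.0, 1.0  # prior mean and prior error variance
for z in [2.0] * 9:
    x, p = kalman_step(x, p, z)
print(x, p)
```

With a constant state and no process noise, each update adds $1/r$ to the inverse error variance, so the error variance shrinks as measurements accumulate.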


B-178 


APPENDIX  G 


DERIVATION OF EQS. (4.3-32) AND (4.3-33)

The  general  algorithm  has  been  derived  by  Eq.  (4.3-12): 


Hj+i  “  Hj  +  Lj+i  2j+i(dj+i  "  nj+i 


where 


r  -  —  p 

— j+i  $  — j+i 


“d  Z^h  “  %  +  o  I>_1  +  j  Hj+1  Vi 


In the stationary case $q = 0$, thus

-1-11  T 

Pj+1  Pj  +  ♦  %+!  *3+1 


or 


rj+l  “  Fj  +  ^j+l  Sj+l 


which  has  a  recursive  solution 

-1-1  1  3  t 

r.x  =  r  +  j  <4  .  E.  n_  O 

j  o  j  k«l  -k  — k 

=  r”1  +  j  R  -  j  R 

o  J  -n  h 

(derived  in  Eq.  (4.3-23) 
Combining  Eqs.  (G-l)  and  (G-2)  gives 


2)+l  '  Sj  +  Fi«  Vl  <dj«  -  =i+l  V 


-1  -11  T 

let  pj+i  '  pj  +  <t>  4j+i  rJ+i 

be  the  weighting  matrix  for  the  stationary  case 
and 

P'  A  P_,  +  B 


-3+1  =  -J+i  “J+I 


(G-l) 

(G-2) 

(G-3) 


(G-4) 


(G-5) 


(G-6) 


(G-7) 


(G-8) 


B-179 


be  the  weighting  matrix  for  the  nonstationary  case  such  that 

<Fi+i -  Vi*'1  ■ (p) + < i*'1  *  ♦  Vi  Vi 

Since  from  Eq.  (G-6) 

Pj1  «  0(3) 

we  can  write  for  large  J 

*  tj+i  4+i "  <pi+i *  v* 


i  nj+1  aj+1  «  (P  +  <1 1)'1 


Thus,  Eq.  (G-  9 )  becomes 


<Vi  *  Virl  *  <  V I>_1 


Since $P_j \sim O(j^{-1})$ and $B_j \to B = \text{const.}$, we can write

$P_j \ll B_j$ for large $j$
or  from  Eq. 

B  =>  q 

which is Eq. (4.3-32).

Returning to Eq. (G-7), since, for large $j$,


Pj+1  “  Pj  +  Bj  =  Bj  *  6  =  ' 

the  corresponding  algorithm  becomes 


V  •  Sj  +  f  Vl  W3  -  4+1  V 

Comparing  with  the  ordinary  algorithm 


Vl  "  V  *3  Vl  CdJ  ~  V 


(G-9) 

(G-10) 

(G-ll) 

(G-12) 

(G-13) 

(G-14) 

(G-15) 

(0-16) 

(g-17) 


B-180 


we can replace $\gamma$ by $\gamma'$ such that

Yi  •  h  +  B  ■  h  *  f 


which is Eq. (4.3-33), which has $B = 0$ for $q = 0$.


(G-18) 


B-181 


REFERENCES 


1. Wiener, N., Extrapolation, Interpolation, and Smoothing of Stationary
Time Series, John Wiley and Sons, N. Y., 1949.

2. Kalman, R. E., and Bucy, R. S., "New Results in Linear Filtering and
Prediction Theory," J. Basic Engineering, vol. 83D, 1961.

3. Faran, J. J. and Hills, R., "Wide-Band Directivity of Receiving Arrays,"
Harvard Univ. Acoust. Res. Lab. Tech. Memo. No. 31, May 1953.

4. Mermoz, H., "Adaptive Filtering and Optimal Utilization of an Antenna,"
Bureau of Ships Translation No. 903, Oct. 1965.

5. Burg, J., "Three-Dimensional Filtering with an Array of Seismometers,"
Geophysics, vol. 29, no. 5, pp. 693-713, Oct. 1964.

6. Bryn, F., "Optimum Signal Processing of Three-Dimensional Arrays
Operating on Gaussian Signals and Noise," J. Acoust. Soc. Am., vol. 34,
no. 3, pp. 289-297, March 1962.

7. Edelblute, D. J., Fisk, J. M., and Kinnison, G. L., "Criteria for Optimum-
Signal-Detection Theory for Arrays," J. Acoust. Soc. Am., vol. 41, no. 1,
pp. 199-205, Jan. 1967.

8. Schultheiss, P. M., and Tuteur, F. B., "Optimum and Suboptimum Detection
of Directional Gaussian Signals in an Isotropic Gaussian Noise Field,
Part I - Likelihood Ratio and Power Detector, Part II - Degradation of
Detectability Due to Clipping," IEEE Trans. on Military Electronics,
pp. 197-211, July-Oct. 1965.

9. Tuteur, F. B., "On the Detection of Transiting Broadband Targets in Noise
of Uncertain Level," IEEE Trans. on Comm. Tech., vol. COM-15, no. 1,
pp. 61-69, Feb. 1967.

10. Tuteur, F. B., "Some Aspects of the Detectability of Broadband Sonar
Signals by Nondirectional Passive Hydrophones," The Rand Corporation
Research Memo. RM-4578-ARPA, June 1965.

11. Widrow, B., "Adaptive Sampled Data Systems," International Federation of
Automatic Control, Moscow, 1960.

12. Gabor, D., et al., "A Universal Non-linear Predictor, Filter, and
Simulator," Proc. of IEE (London), Part B, vol. 108, July 1960.

13. Bucy, R. S. and Follin, J. W., "Adaptive Finite Time Filtering,"
IRE Trans. on Automatic Control, vol. AC-7, July 1962.

14. Narendra, K. S. and McBride, L. E., "Multiparameter Self-Optimizing Systems
Using Correlation Techniques," IEEE Trans. on Automatic Control, vol. AC-9,
Jan. 1964.

15. Widrow, B., "Adaptive Filters I: Fundamentals," Tech. Rept. No. 6764-6,
Stanford University, Dec. 1966.


B-1  82 


16. Widrow, B., et al., "Adaptive Antenna Systems," Proc. of IEEE,
vol. 55, no. 12, pp. 2143-2158, Dec. 1967.

17. Chang, J. H. and Tuteur, F. B., "Methods of Stochastic Approximation
Applied to the Analysis of Adaptive Delay Line Filters," Dunham
Laboratory Tech. Rept. CT-17, Yale University, Nov. 1967.

18. Glaser, E. M., "Signal Detection by Adaptive Filters," IRE Trans. on
Information Theory, vol. IT-7, pp. 87-98, April 1961.

19. Jakowatz, C. V., Shuey, R. L., and White, G. M., "Adaptive Waveform
Reception," Proc. 4th London Symp. on Information Theory, Butterworths,
1961.

20. Daly, R. F., "Adaptive Binary Detectors," Stanford University Elect.
Lab. TR 2003-2, Stanford, Calif., June 1961.

21. Scudder, H. J., "Adaptive Communication Receivers," IEEE Trans. on
Information Theory, vol. IT-11, pp. 167-174, April 1965.

22. Peterson, W. W., Birdsall, T. G., and Fox, W. C., "The Theory of Signal
Detectability," IRE Trans. on Information Theory, vol. IT-4, Sept. 1954.

23. Cox, H., "Array Processing Against Interference," Unpublished Report,
Naval Ship Systems Command, Washington, D.C., Oct. 1967.

24. Knapp, C. H., "Optimum Linear Filtering for Multi-element Arrays,"
General Dynamics/Electric Boat Division Technical Report U417-66-031,
Nov. 1966.

25. Schultheiss, P. M., "Advanced Topics in Linear Systems," Lecture Notes,
Yale University, New Haven, Conn.

26. Churchill, R. V., Fourier Series and Boundary Value Problems,
McGraw-Hill Book Co., New York, 1963.

27. Narendra, K. S., Gallman, P., and Chang, J. H., "Identification of
Nonlinear Systems Using Gradient and Iterative Techniques," Dunham
Laboratory Tech. Rept. CT-3, Yale University, Aug. 1966.

28. Robbins, H. and Monro, S., "A Stochastic Approximation Method," Ann.
Math. Stat., vol. 21, 1951.

29. Kiefer, J. and Wolfowitz, J., "Stochastic Estimation of the Maximum of
a Regression Function," Ann. Math. Stat., vol. 23, 1952.

30. Blum, J. R., "Multidimensional Stochastic Approximation Methods,"
Ann. Math. Stat., vol. 25, 1954.

31. Dvoretzky, A., "On Stochastic Approximation," Proc. Third Berkeley Symp.
of Math. Stat. and Prob., Univ. of California Press, 1956.


B-183 


32. Schultheiss, P. M., "Passive Detection of a Sonar Target in a
Background of Ambient Noise and Interference from a Second Target,"
General Dynamics/Electric Boat Research, Progress Report No. 17,
Sept. 1964. Also appeared in JASA, vol. 43, no. 3, 1968.

33. Tuteur, F. B., "The Effect of Noise Anisotropy on Detectability in an
Optimum Array Processor," General Dynamics/Electric Boat Research,
Progress Report No. 33, Sept. 1967.

34. Amari, S., "A Theory of Adaptive Pattern Classifiers," IEEE Trans. on
Electronic Computers, vol. EC-16, no. 3, pp. 299-307, June 1967.

35. Wozencraft, J. M. and Jacobs, I. M., Principles of Communication
Engineering, John Wiley, New York, 1965.

36. Dupac, V., "A Dynamic Stochastic Approximation Method," Ann. Math. Stat.,
vol. 36, pp. 1695-1702, 1965.

37. Abramson, N. and Braverman, D., "Learning to Recognize Patterns in a
Random Environment," IRE Trans. Information Theory, vol. IT-8, pp. s-58
to s-63, Sept. 1962.

38. Sakrison, D. J., "Stochastic Approximation: A Recursive Method for
Solving Regression Problems," in Advances in Communication Systems,
vol. II, edited by A. V. Balakrishnan, Academic Press, New York, 1966.

39. Schultheiss, P. M., "Likelihood Ratio Detection of Gaussian Signals
with Noise Varying from Element to Element of the Receiving Array,"
General Dynamics/Electric Boat Research, Progress Report No. 10,
January 1964.

40. McGrath, R. J. and Rideout, V. C., "A Simulator Study of a Two-Parameter
Adaptive System," IEEE Trans., AC-6, pp. 35-42, Feb. 1961.

41. Kesten, H., "Accelerated Stochastic Approximation," Ann. Math. Stat.,
vol. 29, pp. 41-59, 1958.

42. Comer, T. R., "Some Stochastic Approximation Procedures for Use in
Process Control," Ann. Math. Stat., vol. 35, no. 3, 1964.

43. Bryson, A. E., and Frazier, M., "Smoothing for Linear and Nonlinear
Dynamic Systems," Proc. Systems Optimization Conf., Ohio, September 1962.

44. Ho, Y. C., "On the Stochastic Approximation Method and Optimal Filtering
Theory," J. of Math. Analysis and Applications, vol. 6, pp. 152-154, 1962.

45. Tuteur, F. B., "The Optimum Detector for Nonisotropic Noise,"
General Dynamics/Electric Boat Research, Progress Report No. 38,
Sept. 1968.

46. Lucky, R. W., "Automatic Equalization for Digital Communication,"
Bell System Technical Journal, vol. 44, pp. 547-588, April 1965.

47. Papoulis, A., Probability, Random Variables, and Stochastic Processes,
McGraw-Hill Book Co., N. Y., 1965.


B-184 




OPTIMUM  PASSIVE  BEARING  ESTIMATION 
IN  A  SPATIALLY  COHERENT  NOISE  ENVIRONMENT 


Verne  H.  MacDonald 


Progress  Report  No.  40 
General  Dynamics/Electric  Boat  Research 
(8050-31-55001) 

September  1969 


DEPARTMENT  OF  ENGINEERING 
AND  APPLIED  SCIENCE 


YALE  UNIVERSITY 


I  Introduction 


The present report extends the analytical techniques outlined in
Progress Report No. 37 ("Optimum Passive Bearing Estimation in a
Spatially Incoherent Noise Environment," by Verne MacDonald and
Peter M. Schultheiss) to the physical situation shown in Figure 1. A
linear passive hydrophone array is used to estimate the bearing $\theta$
of a distant target in the presence of a distant interfering acoustical
source at bearing $\phi$. The linear array contains M hydrophones arbitrarily
placed at points $(r_1, \ldots, r_M)$ relative to an arbitrary origin. Bearing is
measured relative to an axis perpendicular to the array axis. The target
signal wavefront and the interference wavefront impinging on the array
are assumed to be essentially planar; this assumption implies that the
target and interference ranges are much greater than the array length.

We obtain a theoretical lower bound on rms bearing estimation
error through the Cramér-Rao inequality, and we compare this bound
with the rms error of a modified split beam tracker derived in
Progress Report No. 29.

As in Progress Report No. 37, we neglect any inhomogeneities of
sound velocity or attenuation in the water. Ambient noise is assumed
independent from hydrophone to hydrophone, with equal power and identical
spectra at all hydrophones. The results obtained here are intended
primarily to show to what extent a distant point-source interference
degrades target bearing measurement accuracy, but the results also
indicate at least qualitatively the effect of any directional (coherent)
component of noise, irrespective of how this component arises. In this
report, the interference bearing is assumed to be known; a future report
will treat the problem of an interference of unknown bearing.

The procedure for obtaining the Cramér-Rao lower bound is
straightforward, but the details are extremely laborious. For that
reason, only selected intermediate results are presented in this report.
A cumbersome result is obtained for arbitrary signal, interference,
and noise spectra and hydrophone spacing. This result is made somewhat
more manageable by the assumptions:

1. ambient noise power much greater than signal power.

2. signal, interference, and noise spectra of identical form.

3. uniform hydrophone spacing.

II  Sketch  of  the  Procedure  for  Finding  a  Theoretical  Lower  Bound 
on  Mean  Square  Bearing  Estimation  Error 

Let $p(x \mid \theta, \phi)$ be the joint probability density of the hydrophone
data $x$ for specified target bearing $\theta$ and interference bearing $\phi$.
The Cramér-Rao inequality places the following lower bound on the variance
of any estimate $\hat{\theta}$ of $\theta$ based on $x$.¹

$$ \overline{(\hat{\theta} - \theta)^2} \ge \frac{(1 + db/d\theta)^2}{-\,\overline{\partial^2 \log p(x \mid \theta, \phi) / \partial \theta^2}} \tag{1} $$

Here $b$ is bias, and overbars denote averages with respect to the
distribution of $x$.

The data $x$ may take any legitimate form, such as time samples or
expansion coefficients of the waveforms produced by the hydrophones. For
analytical convenience, we let $x$ be the vector of the complex
coefficients of exponential Fourier series expansions of all hydrophone
output waveforms over a time interval $(-T/2, T/2)$. If the output of the
ith hydrophone is represented by $f_i(t)$, then the Fourier coefficients
take the form


T/2 


(2) 


W-  | 


£t(t)e 


dt 


<i-l . **) 


-T/2 


$$ \omega_k = \omega_0 + 2\pi k / T \qquad (k = 1, \ldots, n) \tag{3} $$

We will take $\omega_0$, the lower limit on the processed bandwidth, to be 0,
but it need not be. Our data are then arrayed in the Mn-dimensional
vector

$$ F = [F_1(\omega_1), \ldots, F_M(\omega_1); \ldots; F_1(\omega_n), \ldots, F_M(\omega_n)] \tag{4} $$

If $s(t)$ and $i(t)$ represent the signal and interference components
respectively of the waveform at the array origin, and if $n_i(t)$
represents the ambient noise component at the ith hydrophone, then the
time waveform at the output of the ith hydrophone is

$$ f_i(t) = s(t - \Delta_i) + i(t - \epsilon_i) + n_i(t) \tag{5} $$

with

$$ \Delta_i = (r_i / c) \sin\theta, \qquad \epsilon_i = (r_i / c) \sin\phi \tag{6} $$

where $c$ is sound velocity. We assume that $s(t)$, $i(t)$, and all $\{n_i(t)\}$
are stationary zero-mean Gaussian processes, with the result that all
components of $F$ are stationary zero-mean jointly Gaussian random
variables. The joint probability density of $F$ can then be written as

1  -  derived  in  Appendix  A. 


$$ p(F \mid \theta, \phi) = \frac{1}{(2\pi)^{Mn} \det R(\theta, \phi)} \exp\{ -\tfrac{1}{2} F^T R^{-1}(\theta, \phi) F^* \} \tag{7} $$

where T and * indicate transposition and complex conjugation, respectively.
The elements of the correlation matrix $R$ are


(8>  “  55  v<v 

Appendix B represents $R$, $\det R$ and $R^{-1}$ in detail.

Once the forms of $\det R$ and $R^{-1}$ are established they can be substituted
into the probability density (7) and the Cramér-Rao inequality (1)
can be applied to this probability density. As a matter of terminology,
one may wish to view $p(F \mid \theta, \phi)$ as being a likelihood function
$L(\theta)$ or a conditional likelihood function $L(\theta \mid \phi)$.


III Results for Arbitrary Spectra and Array Geometry

When the Cramér-Rao inequality (1) is applied to the joint
probability density (7) without any new assumptions or approximations,
the following general result is obtained after many tedious steps:


(9) 

n 

I 

k«=l 


(0-  6)2 


var  (8) 


,,.k.2  2  2 

k(S  )  cos 

2  k 

cz  D 


(1  +  db/de) 

Url  14 

z  z 


2(Ik)2 


D 


1-1  j-1+1 


(ri-xj)sinij 


+ 


C-4 


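where

$$ D^k = (N^k)^2 + M N^k (S^k + I^k) + S^k I^k \Big[ M(M-1) - 2 \sum_{i=1}^{M-1} \sum_{j=i+1}^{M} \cos_{ij}^k \Big] \tag{10} $$

$$ \sin_{ij}^k \triangleq \sin\Big[ \omega_k \frac{r_i - r_j}{c} (\sin\theta - \sin\phi) \Big] \tag{11} $$

$$ \cos_{ij}^k \triangleq \cos\Big[ \omega_k \frac{r_i - r_j}{c} (\sin\theta - \sin\phi) \Big] \tag{12} $$

and the details of the notation are:

$\theta, \phi$ : target, interference bearings, respectively
$\{r_i\}$ : hydrophone locations
$c$ : sound velocity
$k$ : frequency index $(1, \ldots, n)$
$i, j, p, q$ : hydrophone indices $(1, \ldots, M)$
$(S^k, I^k, N^k) = [S(\omega_k), I(\omega_k), N(\omega_k)]$ : signal, interference,
and noise spectra, resp.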
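The structure of the factor $D^k$ can be checked for the smallest array, M = 2. The sketch below assumes (this is not legible in the source) that at a single frequency the correlation matrix has the form $R = N\,\mathbf{1} + S\,a a^H + I\,b b^H$, where $a$ and $b$ are plane-wave phase vectors for the target and interference, and it reads Eq. (10) as $D = N^2 + MN(S+I) + SI[M(M-1) - 2\sum\sum \cos_{ij}]$. For M = 2 the determinant of $R$ should then equal $D$ exactly. All function names and numerical values are hypothetical.

```python
import cmath
import math


def d_factor(N, S, I, M, cos_sum):
    """Factor D read from Eq. (10); cos_sum is the double sum of cos_ij over i < j."""
    return N**2 + M * N * (S + I) + S * I * (M * (M - 1) - 2 * cos_sum)


def det_two_hydrophones(N, S, I, omega, r, c, theta, phi):
    """Determinant of the assumed 2x2 correlation matrix at one frequency."""
    a = [cmath.exp(-1j * omega * (ri / c) * math.sin(theta)) for ri in r]  # target phases
    b = [cmath.exp(-1j * omega * (ri / c) * math.sin(phi)) for ri in r]    # interference phases
    R = [[(N if p == q else 0.0)
          + S * a[p] * a[q].conjugate()
          + I * b[p] * b[q].conjugate()
          for q in range(2)] for p in range(2)]
    det = R[0][0] * R[1][1] - R[0][1] * R[1][0]
    return det.real  # R is Hermitian, so the determinant is real


# Hypothetical numbers: two hydrophones 2 ft apart, 1 kHz line.
N, S, I = 1.0, 0.1, 2.0
omega, r, c = 2 * math.pi * 1000.0, [0.0, 2.0], 5000.0
theta, phi = 0.3, -0.5
cos_12 = math.cos(omega * (r[0] - r[1]) / c * (math.sin(theta) - math.sin(phi)))
print(det_two_hydrophones(N, S, I, omega, r, c, theta, phi), d_factor(N, S, I, 2, cos_12))
```

The two printed numbers agree, which supports reading the bracketed term in (10) as $M^2$ minus the squared magnitude of the inner product of the two phase vectors.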

We shall assume that the observation time T is sufficiently large
so that negligible error is introduced in (9) by converting the sum over
the frequency index to an integral with respect to frequency. If one
multiplies all terms in the summation over k in (9) by the factor
$(T \Delta\omega / 2\pi)$, which equals unity, and then lets
$\Delta\omega \to d\omega$, the result may be written


-  2 

(13)  (e  -  9  ) 


-  (1  +  db/de)  + 


f n 


*  C  D(bl) 


“  M-l  M  — 

±  i 

2I2(w) 

|  1  1  •lnJLj(w) 

i-1  j-l+l  - 

> 

D(u) 

N((tf)+MI(n)) 

N(w) 


M-l  h 
Z  Z 

i-1  j-i+1 


<ri~rj^ 


M-lM 

_M  fE  r  (-a|t  r< 
N(w>  L  ^y:1+1  'P-1 


PI <ri+r4> 
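The sum-to-integral conversion used to pass from (9) to (13) can be illustrated numerically. The sketch below uses a hypothetical smooth integrand `spectrum_term` (not the actual summand of (9)) and compares the sum over the Fourier lines $\omega_k = 2\pi k / T$ with $(T / 2\pi)$ times the integral over the same band; the closed-form integral applies only to this stand-in function.

```python
import math


def spectrum_term(omega):
    """Hypothetical smooth function of frequency standing in for the summand."""
    return 1.0 / (1.0 + (omega / 1000.0) ** 2)


def sum_over_lines(T, omega_max):
    """Sum of the summand over the lines omega_k = 2*pi*k/T up to omega_max."""
    n = round(omega_max * T / (2 * math.pi))  # number of lines in the band
    return sum(spectrum_term(2 * math.pi * k / T) for k in range(1, n + 1))


def integral_form(T, omega_max):
    """(T / 2*pi) times the exact integral of spectrum_term from 0 to omega_max."""
    area = 1000.0 * math.atan(omega_max / 1000.0)
    return T / (2 * math.pi) * area


omega_max = 2 * math.pi * 5000.0
for T in (1.0, 10.0):
    s = sum_over_lines(T, omega_max)
    g = integral_form(T, omega_max)
    print(f"T={T:5.1f}  sum={s:10.3f}  integral form={g:10.3f}  rel. diff={abs(s - g) / g:.2e}")
```

The relative difference shrinks as T grows, which is the sense in which a sufficiently long observation time makes the replacement harmless.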


IV Results for Weak Signal, Identical Spectra, and Uniform Spacing

In order to obtain a less cumbersome result, we now make the
following assumptions:

$$ N(\omega) \gg M S(\omega) \tag{14} $$

$$ [S(\omega), I(\omega), N(\omega)] = [S\,G(\omega), I\,G(\omega), N\,G(\omega)], \quad 0 \le \omega \le \omega_{max} \quad (G(\omega) \text{ arbitrary}) \tag{15} $$

$$ r_i = i d, \qquad i = 1, \ldots, M \tag{16} $$


Assumption (14) states that the coherent sum of signal power from all
the hydrophones is still much smaller than the ambient noise power
at any one hydrophone. This assumption permits one to neglect the
frequency-dependent part of the factor $D^k$ (10). Condition (15)
requires that the signal, interference, and ambient noise processes
have identically shaped spectra over the processed frequency band.
This assumption is intended primarily to simplify the complicated
integral in (13), but it is actually realistic for certain cases of interest.
If the target and interference are similar vessels, for instance, then
the spectra of the broadband acoustical waveforms emanating from them
may be quite similar. The ambient noise may also be similar in
character to the signal and interference processes, except that the
noise is likely to have a broader bandwidth. The assumption (15) is
realistic if the ratios $S(\omega)/I(\omega)$, $S(\omega)/N(\omega)$, and
$N(\omega)/I(\omega)$ remain close to some constant values throughout the
processed frequency band. The integral in (13) is simplified considerably,
since one in effect replaces the spectra $[S(\omega), I(\omega), N(\omega)]$
by the constants $(S, I, N)$, respectively. Uniform hydrophone spacing (16)
offers a typical example of linear array configurations, and it converts
the complicated sums involving the $\{r_i\}$ into polynomials in M.
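As a small illustration of how uniform spacing turns sums over hydrophone positions into polynomials in M, consider the pairwise sum $\sum_{i<j} (r_i - r_j)^2$. Whether this exact sum is the one appearing in (13) is an assumption made for the example; with $r_i = id$ it evaluates to $d^2 (M^4 - M^2)/12$. The function names are hypothetical.

```python
def pairwise_square_sum(positions):
    """Sum of (r_i - r_j)**2 over all hydrophone pairs i < j."""
    M = len(positions)
    return sum((positions[i] - positions[j]) ** 2
               for i in range(M - 1) for j in range(i + 1, M))


def closed_form(M, d):
    """Polynomial in M obtained for uniform spacing r_i = i*d."""
    return d ** 2 * (M ** 4 - M ** 2) / 12


d = 2.0  # spacing in feet (illustrative)
for M in (2, 5, 10, 50):
    positions = [i * d for i in range(1, M + 1)]
    print(M, pairwise_square_sum(positions), closed_form(M, d))
```

The brute-force sum and the polynomial agree for every M tried, and the polynomial carries the same $(M^4 - M^2)$ dependence that shows up in the weak-signal bounds.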

The  substitution  of  assumptions  (14) -(16)  into  the  general  result 
(13) ,  together  with  the  assumption  that  I  0  and  the  definition 
a)  ,  yields  the  following  result  s 


(17)  (S-SV  i  o  |..«'Wi7  A  , 

TS  w  J  d  cos  0 


2T 

|D 


M^-M 


M-l 


180 


3  4 


T  1 


P-1 


(M-p) 2  (y  cos(2py)  +  (py  1)  sin(2py)} 

2p 


I 

A 


o 

M-2  rl-l 


-  G, 


“"4  '1-X  — 1 

|2y  cost (p-q)y]+l(p-q)y2_  2 _ lsin[(p-q)yfl 

yJ  p-i  q-p+i  (p-<j)2  I—  \  (p-q)l  _i 


"V" 


— 1 


M-2  M-l 

-i-  •  C  E  CM~p)  (H-q)pq  j2y  eo«[ (p+  i)y]+  Up+ql»y2-2  I  »ln{(p+q)y]  | 

y3  P»i  Q“p+'  (P^>2  L  v  ji 


-  G„ 


•|L 


1  +  (H-2)  I_ 

N  -J 


.  1  .  M  s  |M3-2pM2-Mfp+p3||2Y 

y3  1  p-iL  p2  JL 


h4-m2 

36 


-  G. 

4 


‘  -2pM  -m-p+p  1 1 2y  coa(py)+(py2_  2  )  aln(pjr) 
2 

P-1U  P 


g+db/de)2  irc2p 

.J  3  ,2  2d 

IS  ui  c  cos  6 

max 


3  G5 


..  .ic  re 

$$ D = N^2 + MN(S + I) + M(M-1) S I \tag{18} $$

$$ y = \frac{d}{c}\, \omega_{max} (\sin\theta - \sin\phi) \tag{19} $$

The result (17) can be made more comprehensible by considering
its asymptotic forms for large y (interference remote in bearing
from target) and small y (interference near target in bearing).

A.  Remote  Interference 


If  y  »  1,  the  oscillatory  expressions  G^.Gj,  and  may  be 
neglected  with  respect  to  Gq.  and  G^  may  be  neglected  with  respect 
to  (h-2)G^,  for  M  »  1.  The  lower  bound  given  by  (17)  is  then 
approximately 

(20)  (0-0) 2  m  -(1  +  db/de)2  it  c2p 

TS2<7  d2cos2  0 


Further approximations can be made in this result if one assumes
either that ambient noise dominates interference or vice-versa. The
factor D (18) takes different forms in these two limiting cases.

$$ D \approx \begin{cases} N^2 + MN(S + I), & MI \ll N \\ MIN + M(M-1) I S, & MI \gg N \end{cases} \tag{21} $$
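The two limiting forms of D can be compared numerically with the full weak-signal expression $D = N^2 + MN(S+I) + M(M-1)SI$. The sketch below is only an illustration; the function names and the values of M, N, S, and I are hypothetical.

```python
def d_full(N, S, I, M):
    """Weak-signal form of D, Eq. (18)."""
    return N**2 + M * N * (S + I) + M * (M - 1) * S * I


def d_noise_dominant(N, S, I, M):
    """Limiting form of D for MI << N."""
    return N**2 + M * N * (S + I)


def d_interference_dominant(N, S, I, M):
    """Limiting form of D for MI >> N."""
    return M * I * N + M * (M - 1) * I * S


M, N, S = 10, 1.0, 1e-3  # weak signal: N >> M*S

weak_I = 1e-3    # M*I = 0.01 << N
strong_I = 100.0  # M*I = 1000 >> N

err_weak = abs(d_full(N, S, weak_I, M) - d_noise_dominant(N, S, weak_I, M)) / d_full(N, S, weak_I, M)
err_strong = abs(d_full(N, S, strong_I, M) - d_interference_dominant(N, S, strong_I, M)) / d_full(N, S, strong_I, M)
print(f"relative error, MI << N: {err_weak:.2e}")
print(f"relative error, MI >> N: {err_strong:.2e}")
```

In both regimes the dropped terms contribute a fraction of a percent or less for these values, which is why each limiting form can stand in for D in the corresponding case.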


Substituting  (21)  into  (20)  yields 


(1  +  db/dO)2  36  a  e2 


(22)  <e-e> 


2  > 


Ta)Jx  d2  cos26(M4- 


■M2)  |l+(M-2)i| 


(1  +  db/de)2  36n  c2 


To) 


max 


d  cos' 


:0£(M4-M2)_  8  M3+2mJ 


N+MSN  (MI«N) 

„2 


N2+(M-1)SN 


(MI»N) 


* example: d = 2 ft., c = 5,000 ft./sec., $\omega_{max} = 2\pi \times 5{,}000$, $(\sin\theta - \sin\phi) = 1$;
then $y = (d/c)\, \omega_{max} (\sin\theta - \sin\phi) = 4\pi$


If one sets I/N equal to zero in the noise-dominant version of
the above inequality, one obtains a lower bound on the variance of
bearing estimates in the absence of interference. Note that the
lower bound in the interference-dominated case is almost identical
to the lower bound in the no-interference case. This condition obtains,
of course, only when the interference is remote in bearing from the
target. When the number of hydrophones is large, the effect of a
remote interference is equivalent to the loss of 2/5 of a hydrophone.*
Note that we have assumed that the strength of the interference has
negligible effect on the bias $b(\theta)$.

B. Near Interference

If y < 1/M,** one may replace the oscillatory functions in
(17) by the first two terms of their respective Maclaurin series.
When the target and interference are very close in bearing,


(23) 


A  *  7  > 
(6-0;  - 


(1  +  db/de)*  36  * 


cZD 


TS 


3  A2 

u  d 
max 


cos  9 


(M4-:*2)  Q  +<n4-M2H  •  xi 


N  10 


] 


* see Appendix C for explanation.

** example: d = 1 ft., c = 5,000 ft./sec., $\omega_{max} = 2\pi \times 5{,}000$,
$(\sin\theta - \sin\phi) = .003$, M = 50; then
$y = (d/c)\, \omega_{max} (\sin\theta - \sin\phi) = 6\pi \times 10^{-3} < 1/50 = .02$
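The two footnote examples can be reproduced directly from the definition $y = (d/c)\,\omega_{max}(\sin\theta - \sin\phi)$. The sketch below simply evaluates that expression for the quoted numbers (d = 2 ft with a bearing-sine difference of 1, and d = 1 ft with a difference of .003 and M = 50); the function name `y_parameter` is hypothetical.

```python
import math


def y_parameter(d, c, omega_max, sin_diff):
    """y = (d / c) * omega_max * (sin(theta) - sin(phi))."""
    return (d / c) * omega_max * sin_diff


omega_max = 2 * math.pi * 5000.0  # rad/sec
c = 5000.0                        # ft/sec

y_remote = y_parameter(2.0, c, omega_max, 1.0)   # remote-interference example
y_near = y_parameter(1.0, c, omega_max, 0.003)   # near-interference example, M = 50

print(f"remote: y = {y_remote:.4f} (4*pi = {4 * math.pi:.4f})")
print(f"near:   y = {y_near:.5f}, 1/M = {1 / 50:.5f}")
```

The first case gives y well above 1, and the second gives y just under 1/M, matching the regimes in which the remote and near approximations are applied.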


C-10 


To find the results for the noise-dominant and interference-
dominant cases, we substitute the expressions for D in (21) into (23):

r 


(24) (6  6) 


2  > 


(1  +  db/d0)2  36  it  c2 


Tu>  3  d2C0826(M4-M2)j  1  +(K4-M2)I2 

max  1  —  •  * — 


tr-msN  (mi«n) 


.  •  y- 

R2  10 


J 


(1  4-  db/d9)2  360  it  c2  [N+(M-1)S]2  1  (I»H  , 

V  ~  r\  i  n  1  X  *  ~r ' T 


™5  2  5  5  2  2 

Tti)  d  cos  e(M  -M  )(M  -1) 
max 


y>>3  .N) 


M2  T 


(1  +  db/de)2  36 it  C2  M  .  l[N+(M-l)S]  (M1»N  , 

’  5  26(h4-m2)  s2  y«3_  r  > 

M2  I 

By setting (I/N) equal to zero in the first line of (24), we obtain
a lower bound on the variance of θ̂ in the absence of interference.
The last line sets a lower bound on the variance of θ̂ when the
target and interference are essentially coincident in bearing. A
strong interference at the same bearing as the target is seen
to increase the variance of θ̂ by a factor of approximately (MI/N).

The weak-interference versions of (22) and (24) closely resemble
the no-interference result given by Eq. (35) of Progress Report No. 37.
The primary object of including the weak-interference results in (22)
and (24) is to indicate the first-order effect of an interference on
target bearing estimation accuracy. The situation which presents
considerable practical difficulties is that in which a strong
interference is present.

C.  Strong  Interference 

By  substituting  the  strong-interference  version  of  D(21)  into 
the  general  result  (17) ,  one  obtains  the  lower  bound  for  the  strong- 
interference  case  with  arbitrary  y  : 


Tu  d  cos 
max 


C-ll 


(25)  (8-8) 2  • 


(1  -fdb/d8) 2  *  C2  .  ..  H2 
■>2  2  “  2 
Tu  d  cob  8  S 

_ aax 

5  V  Vs 

y  y 


V  Split-Beam  Tracker  Performance 

Progress  Report  No. 29  presents  results  for  the  variance  of 
bearing  estimates  obtained  in  the  presence  of  a  single  point-source 
interference  with  a  modified  version  of  the  split-beam  tracker.  We 
shall repeat some of those results, which are valid for the following
conditions:


1.  signal  weak  with  respect  to  ambient  noise. 

2.  uniform  hydrophone  spacing. 

3.  signal,  interference,  and  ambient  noise  spectra  flat  over 
the  processed  band. 

In the split-beam tracker, the estimate θ̂ is obtained by varying
the steering angle θ₀ to determine for what value of θ₀ the output
z equals zero. This value of θ₀ is then taken as θ̂. The variance of
z, σ_z², can be derived as a function of θ, θ₀, φ, the array geometry, and
the spectra of the signal, interference, and noise processes. Then
the variance of θ̂, σ_θ̂², is given approximately by the formula


(26)  $\sigma_{\hat\theta}^2 \approx \dfrac{\sigma_z^2}{\left(\partial z / \partial\theta_0\right)^2}$


This formula is valid if ∂z/∂θ₀ is essentially constant for θ₀ in
the interval (θ − σ_θ̂, θ + σ_θ̂).
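As a hedged illustration of the linearization behind (26), the following Python sketch assumes an idealized tracker output that is exactly linear in the steering error, z = k(θ − θ₀) + n. The slope k, the noise sample n, and all function names are hypothetical and are introduced here only for illustration. Under that assumption the zero crossing lies at θ₀ = θ + n/k, so the variance of the estimate is σ_z²/k².

```python
import random
import statistics


def linearized_bearing_variance(sigma_z2, slope):
    """Variance of the zero-crossing estimate predicted by the linearization:
    var(theta_hat) ~= var(z) / (dz/dtheta0)^2."""
    if slope == 0:
        raise ValueError("slope must be nonzero for the linearization to apply")
    return sigma_z2 / slope ** 2


def simulate_zero_crossings(theta, slope, sigma_z, trials, seed=0):
    """Monte Carlo check for the assumed linear output
    z = slope * (theta - theta0) + n, with n Gaussian of std sigma_z."""
    rng = random.Random(seed)
    # For a linear z the zero crossing is theta0 = theta + n / slope exactly.
    return [theta + rng.gauss(0.0, sigma_z) / slope for _ in range(trials)]


predicted = linearized_bearing_variance(sigma_z2=0.04, slope=5.0)
estimates = simulate_zero_crossings(theta=0.3, slope=5.0, sigma_z=0.2, trials=20000)
empirical = statistics.pvariance(estimates)
```

When the real output is only approximately linear, the two numbers agree only while the slope stays essentially constant over an interval of about one standard deviation on either side of the true bearing, which is the validity condition stated above.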

From equations ( 38-42, 8fc, 90) of Progress Report No. 29 we have
the following results (for white or prewhitened spectra):

The  notation  used  here  differs  somewhat  from  that  used  in  Progress  Report  No. 29. 




Below some critical value of y, depending on all the parameters in
(30), including σ_θ̂², the expression for σ_θ̂² in (30) becomes invalid
because the condition for the validity of (26) is no longer met.

The form of the latter expression in (30), however, suggests the
very reasonable conclusion that it is essentially impossible to estimate
a target bearing extremely close to the interference bearing, using
the implementation under consideration. It should be noted that (30)
becomes invalid for any value of y when |cos θ| falls below some critical
value, again because the condition on the validity of (26) is not met.

As long as neither y nor |cos θ| is too small, however, (30) gives
an accurate and meaningful result.


VI  A Comparison of the Split-Beam Tracker Variance with the Cramér-Rao Lower Bound


Figures 2 through 4 present a graphical comparison of the split-
beam tracker variance, according to (29), and the Cramér-Rao lower
bound for the strong-interference case according to (25), for various
values of M. In plotting the Cramér-Rao lower bound, we assume that
db/dθ is much smaller than unity, so that the factor (1 + db/dθ)² in
(25) can be ignored. This assumption is invalid for θ near ±π/2,*
but it should be valid for most of the possible range of θ simply
because in a good estimator one would expect bias to be much smaller
than a radian for all values of θ. If b(θ) is a well-behaved function,
the condition that b(θ) is small implies that db/dθ is also small
over most of the allowable range of θ. The lower bound which we have
plotted is




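(31)  $\sigma_{CR}^2 = \left\{ \overline{\left[ \dfrac{\partial \log p(F|\theta,\phi)}{\partial\theta} \right]^2} \right\}^{-1}$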


It should be remembered that the results for the split-beam tracker,
designated σ²_sbt, are also invalid for θ near ±π/2 and for y very
near zero, as explained at the end of the previous section.

Figure 4 displays the ratio σ²_sbt / σ²_CR as a function of
y for various values of M. In general, this ratio is the complicated
quotient of equations (29) and (25), but the form of the ratio is
relatively simple for very large or very small y, also assuming a very
small signal-to-noise power ratio. From equations (24) and (30) we have


(32) 


2.67  (1  +  2 . 4/M)  y»l,  M»1 


y«l,  M»1 


i. 

This expression, of course, becomes invalid for very small y, because
σ²_sbt becomes invalid. It does indicate, however, that when the target
and interference bearings are very close, the variance of the modified
split-beam tracker estimate greatly exceeds the Cramér-Rao lower bound.

The curves for both the split-beam tracker variance (Fig. 2) and
the Cramér-Rao lower bound (Fig. 3) clearly exhibit an overall
variation as M⁻⁴. For y near zero, the curves rise sharply for both the
split-beam tracker and the lower bound.


* It happens that ∂p(F|θ,φ)/∂θ = 0 for θ = ±π/2. The final paragraph of
Appendix A explains that this condition implies that db/dθ = −1 for
θ = ±π/2.




It is clear in both cases, however, that an increase in the number of
hydrophones pushes the "break" value of y progressively closer to
zero. In other words, an increasing number of hydrophones should
permit a system accurately to measure target bearings progressively closer
to the interference bearing.

The shape of the split-beam tracker curves varies only weakly with
the number of hydrophones. The lower bound curves, on the other hand,
become progressively flatter, beyond the steep rise for small y, as M
increases; that is, the performance of an ideal bearing estimator should
be essentially independent of the angular separation between target
and interference for a large number of hydrophones so long as the
separation exceeds some small minimum value. For large values of M, the
lower-bound curves can be approximated very well simply by connecting
the asymptotic curves for large y and small y.

Fig. 4 shows that the ratio of the split-beam tracker variance to
the lower bound depends only weakly on the number of hydrophones. It is
true, however, that as M increases, the split-beam tracker performance
edges slightly closer to the lower bound. Beyond the region of small y,
the ratio remains less than about 4 for an array containing 10 or more
hydrophones. The performance of the modified split-beam tracker is
reasonably good, then, unless target and interference bearings are too
close. The minimum angular separation between target and interference for
satisfactory performance can be expressed roughly in terms of the
beamwidth of the array. The beamwidth is determined by considering the
average signal-derived output z of a conventional detector* as a
function of the parameter y', which is defined exactly as y above,
except that φ is now interpreted to be the steering angle.




The beamwidth is defined in terms of y' as the distance between the
two values of y' for which z falls to half its maximum value, which
occurs at y' = 0. As an arbitrary standard of adequate performance
one might require that the variance of a practical estimator be no
greater than ten times the Cramér-Rao lower bound. From Fig. 4 one
can determine the values of y at which the curves reach the value 10.
These values turn out to be approximately equal to the beamwidth of the
array, assuming a flat signal spectrum with cutoff frequency ω_max.
Thus, by this arbitrary standard, the modified split-beam tracker offers
adequate performance as long as signal and interference bearings are
separated by at least a beamwidth.
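The half-maximum definition of beamwidth can be sketched numerically. The Python fragment below is only an illustration under assumptions that are not taken from the report: a uniformly spaced line array of M hydrophones, a signal spectrum that is flat from zero to ω_max, and a mean detector output equal to the band-averaged power response of a delay-and-sum beam. The function names and the integration step count are hypothetical.

```python
import math


def mean_beam_output(y, M, n_freq=200):
    """Band-averaged power response of an M-element delay-and-sum beam.
    u = omega / omega_max runs over (0, 1); the per-element phase step is y * u."""
    total = 0.0
    for k in range(n_freq):
        u = (k + 0.5) / n_freq  # midpoint rule over the assumed flat band
        re = sum(math.cos(i * y * u) for i in range(M))
        im = sum(math.sin(i * y * u) for i in range(M))
        total += re * re + im * im
    return total / n_freq


def half_power_beamwidth(M, n_freq=200):
    """Distance between the two values of y' at which the mean output falls
    to half its maximum M**2; the response is even in y'."""
    if M < 2:
        raise ValueError("at least two hydrophones are needed to form a beam")
    target = 0.5 * M * M
    step = 0.1 / M
    lo = 0.0
    # March outward from y' = 0 to bracket the first half-maximum crossing.
    while mean_beam_output(lo + step, M, n_freq) > target:
        lo += step
    hi = lo + step
    for _ in range(40):  # bisection inside the bracket
        mid = 0.5 * (lo + hi)
        if mean_beam_output(mid, M, n_freq) > target:
            lo = mid
        else:
            hi = mid
    return lo + hi  # full width: twice the one-sided crossing point


widths = {M: half_power_beamwidth(M) for M in (2, 4, 8, 16)}
```

In this sketch the width in y' shrinks as M grows, which is consistent with the observation that more hydrophones permit bearing measurements closer to the interference bearing.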

The point at which the Cramér-Rao curves in Fig. 3 change from
steep to relatively flat can also be described quite accurately in terms
of beamwidth. The change occurs at approximately 1.75 times the
beamwidth.

VII  Conclusion 

The Cramér-Rao inequality indicates that the presence of a point-
source interference raises the lower bound on target bearing estimation
variance over that obtained in the absence of any interference. If the
target and interference bearings are separated by an angular difference
of more than approximately twice the array beamwidth, the increase in the
lower bound is small. In fact, for a large number of hydrophones, the
increase is equivalent to the loss of only 2/5 of a hydrophone. If the
angular separation between target and interference is on the order of
a beamwidth or less, the increase is substantial. When the target and
interference bearings coincide (the worst case), the lower bound is


* A conventional detector consists of an array of hydrophones followed by a bank of
variable delays for the purpose of steering, then a summer, squarer, and low-pass
filter.


MI/N times greater than in the no-interference case. Since in the
no-interference case the lower bound varies as M⁻⁴, the lower bound for
the case of coincident target and interference bearing varies as
M⁻³. In theory, then, even this lower bound can be made arbitrarily small
by making M sufficiently large, while all other parameters remain
constant.

The modified split-beam tracker discussed in Progress Report No. 29
yields a bearing estimation variance no larger than 4 times the Cramér-
Rao lower bound, as long as the angular separation between the target
and interference is greater than roughly twice the beamwidth. If the
angular separation is substantially smaller than a beamwidth,
however, the bearing estimation variance is unsatisfactorily large.

If the target and interference bearings coincide, this implementation
is completely incapable of measuring target bearing. Thus, although the
modified split-beam tracker offers reasonably good performance when the
interference is remote in bearing from the target, it is unsatisfactory
when the target and interference are very close in bearing, and a
different implementation must be sought for this case.




Appendix A: Derivation of the Cramér-Rao Inequality


Begin by writing the following equation, which is in effect a
definition of bias b(θ), as the discrepancy between the mean value
of the estimate θ̂(x) and the true value θ:

(A-1)  $\overline{\hat\theta(x)} = \int_{R_x} \hat\theta(x)\, p(x|\theta,\phi)\, dx = \theta + b(\theta),$

where R_x is the domain of x. Now differentiate both sides of (A-1)
with respect to θ:

(A-2)  $\int_{R_x} \hat\theta(x)\, \dfrac{\partial p(x|\theta,\phi)}{\partial\theta}\, dx = 1 + db/d\theta$


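Let f(θ) be any function of θ which is not a function of x.

(A-3)  $\int_{R_x} f(\theta)\, \dfrac{\partial p(x|\theta,\phi)}{\partial\theta}\, dx = f(\theta)\, \dfrac{\partial}{\partial\theta} \int_{R_x} p(x|\theta,\phi)\, dx = f(\theta)\, \dfrac{\partial}{\partial\theta}(1) = 0$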


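Subtract (A-3) from (A-2); multiply and divide the integrand by
p(x|θ,φ):

(A-4)  $\int_{R_x} [\hat\theta(x) - f(\theta)]\, \dfrac{\partial p(x|\theta,\phi)}{\partial\theta}\, dx = \int_{R_x} [\hat\theta(x) - f(\theta)]\, \dfrac{1}{p(x|\theta,\phi)} \dfrac{\partial p(x|\theta,\phi)}{\partial\theta}\, p(x|\theta,\phi)\, dx$

       $= \int_{R_x} [\hat\theta(x) - f(\theta)]\, \dfrac{\partial \log p(x|\theta,\phi)}{\partial\theta}\, p(x|\theta,\phi)\, dx = 1 + db/d\theta$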


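The Schwarz inequality reads

(A-5)  $\int_D f^2(x)\, dx \int_D g^2(x)\, dx \ge \left[ \int_D f(x)\, g(x)\, dx \right]^2$

Let

$f(x) = [\hat\theta(x) - f(\theta)]\, \sqrt{p(x|\theta,\phi)}$

and

$g(x) = \dfrac{\partial \log p(x|\theta,\phi)}{\partial\theta}\, \sqrt{p(x|\theta,\phi)}$

Apply (A-5) to (A-4):

(A-6)  $\int_{R_x} [\hat\theta(x) - f(\theta)]^2\, p(x|\theta,\phi)\, dx \int_{R_x} \left[ \dfrac{\partial \log p(x|\theta,\phi)}{\partial\theta} \right]^2 p(x|\theta,\phi)\, dx \ge (1 + db/d\theta)^2$

This inequality is equivalent to

(A-7)  $\overline{[\hat\theta(x) - f(\theta)]^2} \ge \dfrac{(1 + db/d\theta)^2}{\overline{\left[ \dfrac{\partial \log p(x|\theta,\phi)}{\partial\theta} \right]^2}}$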

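Equation (A-7) holds for any f(θ). Using variational techniques,
one can show that the choice for f(θ) which minimizes the left side of
(A-7) is f(θ) = $\overline{\hat\theta}$. Thus for arbitrary f(θ),

(A-8)  $\overline{[\hat\theta(x) - f(\theta)]^2} \ge \overline{[\hat\theta(x) - \overline{\hat\theta}]^2} \ge \dfrac{(1 + db/d\theta)^2}{\overline{\left[ \dfrac{\partial \log p(x|\theta,\phi)}{\partial\theta} \right]^2}}$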

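The right side of (A-7) is often derived as a lower bound on mean
square error (f(θ) = θ), and so it is. It should be emphasized, however,
that if bias is present, mean square error cannot achieve this lower
bound. A tighter lower bound on mean square error is as follows:

(A-9)  $\overline{(\hat\theta - \theta)^2} = b^2(\theta) + \overline{(\hat\theta - \overline{\hat\theta})^2} \ge b^2(\theta) + \dfrac{(1 + db/d\theta)^2}{\overline{\left[ \dfrac{\partial \log p(x|\theta,\phi)}{\partial\theta} \right]^2}}$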
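A small worked example may help make the role of the bias term concrete. The Python sketch below assumes a setting that is not from the report: n independent Gaussian samples with mean θ and variance σ², and a hypothetical shrinkage estimator equal to a times the sample mean. For that estimator b(θ) = (a − 1)θ, db/dθ = a − 1, and the mean square of ∂ log p/∂θ is n/σ², so every term of the biased bound can be written in closed form.

```python
def fisher_information(n, sigma2):
    """Mean square of d(log p)/d(theta) for n i.i.d. N(theta, sigma2) samples."""
    return n / sigma2


def biased_cr_bound(a, n, sigma2):
    """Variance bound (1 + db/dtheta)^2 / Fisher information with db/dtheta = a - 1."""
    db_dtheta = a - 1.0
    return (1.0 + db_dtheta) ** 2 / fisher_information(n, sigma2)


def estimator_variance(a, n, sigma2):
    """Exact variance of the assumed estimator a * (sample mean)."""
    return a * a * sigma2 / n


def mean_square_error(a, theta, n, sigma2):
    """Bias squared plus variance, i.e. the left side of the tighter MSE bound."""
    bias = (a - 1.0) * theta
    return bias ** 2 + estimator_variance(a, n, sigma2)


def mse_lower_bound(a, theta, n, sigma2):
    """Bias squared plus the biased variance bound."""
    bias = (a - 1.0) * theta
    return bias ** 2 + biased_cr_bound(a, n, sigma2)


a, theta, n, sigma2 = 0.8, 0.5, 25, 4.0
variance = estimator_variance(a, n, sigma2)
bound = biased_cr_bound(a, n, sigma2)
mse = mean_square_error(a, theta, n, sigma2)
```

In this assumed example the variance meets the biased bound with equality, while the mean square error exceeds it by exactly b²(θ), which is the extra term in the tighter bound.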

The rightmost two members of (A-8) are seen to be equivalent to (1)
through the following identity, which is proved in various textbooks:¹

(A-10)  $\overline{\left[ \dfrac{\partial \log p(x|\theta,\phi)}{\partial\theta} \right]^2} = -\,\overline{\dfrac{\partial^2 \log p(x|\theta,\phi)}{\partial\theta^2}}$

1. For example, Harry L. Van Trees, Detection, Estimation, and Modulation
Theory, Part I, Section 2.4.

A point deserving comment is the question of what properties, if any,
are common to the bias functions associated with all possible estimators
θ̂(x) which might be used in the same situation. The derivation of the
Cramér-Rao inequality indicates one fact about the derivative db/dθ.
Suppose that the derivative ∂p(x|θ,φ)/∂θ equals zero identically for
some value of θ, which we shall designate as θ₀, independently of x.
According to (A-2), then, unless θ̂(x) is infinite, db/dθ must equal
−1 for θ equal to θ₀, so that both sides of (A-2) equal zero. Since we
know that θ is restricted to a finite interval, the possibility of an
infinite θ̂(x) is unacceptable. The only conclusion that can be drawn,
therefore, is that db/dθ must be −1 at the point θ₀ for any acceptable
estimator θ̂(x). This conclusion may be repeated symbolically as


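(A-11)  $\left. \dfrac{\partial p(x|\theta,\phi)}{\partial\theta} \right|_{\theta=\theta_0} = 0 \;\Rightarrow\; \left. \dfrac{db}{d\theta} \right|_{\theta=\theta_0} = -1$

Note that the behavior of db/dθ away from the point θ₀ and the behavior
of b(θ) at all points are in no way specified by (A-11). These depend
on the specific form of θ̂(x).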

Appendix  B:  The  Correlation  Matrix  R,  Its  Determinant,  and  Its  Inverse 


We shall state the form of the correlation matrix, its determinant
and its inverse without proof. The vector F is written at the edges
of R to indicate which two elements of F correspond to each element
of R.


1)1  S1-*-!1-**1  a12S  +b12I 


a^S^b^I1  S1+I1+N1 


«  4is*+b»il1 


-1..  1  T1 
’ ' ' a2M°  b2M 


.  ..S1+I1+N1 


elements 


all  elements  zero 


V"i> 


lsn+In+Nn - F]  (vfl) 


k  i 

'll  "  * 


•••  riiv-  vv 

y  =* 


^  -  (r^/c)  sin  0 


d^Cr j/c)  sin  i 


(Sk,Ik.Nk)  I<Wk)’  N(wk),:! 


θ, φ: target, interference bearings, respectively

S(ω), I(ω), N(ω): signal, interference, noise spectra, respectively

T: observation time

{r_i}: hydrophone locations

c: sound velocity

i, j: hydrophone indices (1, ..., M)
k: frequency index (1, ..., n)


Both R and R⁻¹ have the property that only the elements of the
M×M diagonal submatrices are nonzero. Each of the n diagonal submatrices
corresponds to a different frequency but has the same form.

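(B-2)  $\det R = (T/2)^{Mn} \prod_{k=1}^{n} (N^k)^{M-2} \left\{ (N^k)^2 + M N^k (S^k + I^k) + S^k I^k \left[ M(M-1) - 2 \sum_{i=1}^{M-1} \sum_{j=i+1}^{M} \cos^k_{ij} \right] \right\}$

where

(B-3)  $\cos^k_{ij} \triangleq \cos\left[ \dfrac{r_i - r_j}{c}\, \omega_k (\sin\theta - \sin\phi) \right]$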

In the absence of interference, det R is independent of θ, but with
interference present, it depends on both θ and φ.

The elements of R⁻¹ are as follows, with i, j hydrophone indices,
and k, l frequency indices:
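The per-frequency factor of the determinant can be checked numerically. The Python sketch below is an illustration under stated assumptions, not the report's own computation: it takes one M×M frequency block to have diagonal entries S + I + N and off-diagonal entries a_ij S + b_ij I, with a_ij and b_ij unit-modulus phase factors set by the target and interference bearings, and it omits the (T/2) scaling. It then compares a direct Gaussian-elimination determinant with the closed form N^(M−2) {N² + MN(S + I) + SI[M(M−1) − 2 ΣΣ cos(·)]}. All function names are hypothetical.

```python
import cmath
import math


def block_matrix(r, theta, phi, omega, c, S, I, N):
    """One frequency block (assumed form): N on the diagonal plus rank-one
    signal and interference terms with plane-wave phase factors."""
    M = len(r)
    R = [[0j] * M for _ in range(M)]
    for i in range(M):
        for j in range(M):
            tau = (r[i] - r[j]) / c
            a = cmath.exp(1j * omega * tau * math.sin(theta))
            b = cmath.exp(1j * omega * tau * math.sin(phi))
            R[i][j] = S * a + I * b + (N if i == j else 0.0)
    return R


def determinant(A):
    """Determinant by Gaussian elimination with partial pivoting."""
    A = [row[:] for row in A]
    n = len(A)
    det = 1 + 0j
    for col in range(n):
        piv = max(range(col, n), key=lambda row: abs(A[row][col]))
        if abs(A[piv][col]) == 0.0:
            return 0j
        if piv != col:
            A[col], A[piv] = A[piv], A[col]
            det = -det
        det *= A[col][col]
        for row in range(col + 1, n):
            f = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= f * A[col][k]
    return det


def determinant_formula(r, theta, phi, omega, c, S, I, N):
    """Closed form for one block, without the (T/2) scaling."""
    M = len(r)
    cos_sum = sum(
        math.cos((r[i] - r[j]) / c * omega * (math.sin(theta) - math.sin(phi)))
        for i in range(M - 1) for j in range(i + 1, M)
    )
    return N ** (M - 2) * (N ** 2 + M * N * (S + I) + S * I * (M * (M - 1) - 2 * cos_sum))


r = [0.0, 2.0, 4.0, 6.0, 9.0]  # hypothetical hydrophone positions, ft
args = dict(theta=0.3, phi=-0.5, omega=2 * math.pi * 1000.0, c=5000.0, S=0.7, I=3.0, N=1.0)
direct = determinant(block_matrix(r, **args))
closed = determinant_formula(r, **args)
```

Setting I to zero in this sketch removes the cosine term, so the result no longer changes with θ, in line with the statement that det R is independent of θ when no interference is present.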


(B-4)  R"1  35  (Nk)M_3 

i»k;j»-t  -  _ x 

T  det  R 


’  (Nk)2+(M-l)Nk(Sk+Ik)+SKlk[(M-l)  <M-2)-2  £  f  «s  M 

p-lq-p+1  Pq 

p.q^i 


(i-j)  ; 




The two limiting forms of the result (22) indicate that the factor
(M⁴ − M²) in the no-interference case (I/N = 0) changes to [M⁴ − M² − (8/5)M³ + 2M]
in the remote-interference case. The equivalent cost x in hydrophones is
found by substituting (M − x) for M in the no-interference expression and
setting it equal to the remote-interference expression.

(C-1)  $(M-x)^4 - (M-x)^2 = M^4 - M^2 - (8/5)M^3 + 2M$

       $M^4 - 4M^3 x + \ldots - (M^2 + \ldots) = M^4 - M^2 - (8/5)M^3 + \ldots$

       $x \approx 2/5, \quad M \gg 1$

Thus, having a strong remote interference with M hydrophones is in a
sense equivalent to having no interference and (M − 2/5) hydrophones.
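The limiting value can be confirmed by solving the equivalence condition numerically rather than by keeping only leading terms. The Python sketch below is a minimal illustration; the function name and the bisection bracket [0, 1] are choices made here, not part of the report.

```python
def equivalent_hydrophone_loss(M, iterations=60):
    """Solve (M - x)^4 - (M - x)^2 = M^4 - M^2 - (8/5) M^3 + 2 M for x
    by bisection on the bracket [0, 1], which contains the root for M >= 2."""
    if M < 2:
        raise ValueError("the bracket [0, 1] is only checked for M >= 2")
    target = M ** 4 - M ** 2 - 1.6 * M ** 3 + 2 * M

    def g(x):
        m = M - x
        return m ** 4 - m ** 2 - target

    lo, hi = 0.0, 1.0  # g(lo) > 0 > g(hi), and g decreases with x on this bracket
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


losses = {M: equivalent_hydrophone_loss(M) for M in (10, 100, 1000)}
```

The computed loss sits slightly above 2/5 for moderate M and approaches 2/5 as M grows, as the leading-term argument predicts.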


Figure 1. Array Geometry (Single Interference)


SPACE-TIME PROPERTIES OF SONAR DETECTION MODELS


by 


James  Peyton  Gray 


Progress  Report  No.  41 


General  Dynamics/Electric  Boat  Research 
April  1970 


DEPARTMENT  OF  ENGINEERING 
AND  APPLIED  SCIENCE 


YALE  UNIVERSITY 


SUMMARY 


Space-time Properties of Sonar Detection Models
James Peyton Gray
April 1970

A measure theoretic structure that is general enough to encompass
most  models  of  signal  detection  is  used  to  investigate  singularities 
in  models  of  sonar  detection.  Singularities  that  appeared  in  previous 
sonar  work  are  shown  to  derive  from  simplified  modeling  of  sound 
generation  and  transmission.  The  existence  of  singularities  in  a 
model  of  sonar  detection  is  also  shown  to  seriously  restrict  the 
usefulness  of  that  model  in  investigations  of  sonar  array  design. 
Finally,  methods  for  avoiding  singularities  are  discussed. 




CONTENTS 


1.  INTRODUCTION  1 

1.1  Sonar  Array  Design  1 

1.2  Array  Optimization  Algorithms  5 

1.3  Previous  Work  in  Singularity  of  Models  9 

2.  MODELS  OF  COMMUNICATION  AND  DETECTION  SYSTEMS  12 

2.1  The  Concept  of  Classes  of  Models  13 

2.2  Definitions  16 

2.3  Induced  Measures  20 

2.4  Two  Examples  of  Classes  of  Models  25 

2.5  Topologies  on  Classes  of  Models  28 

2.6  Application  to  Examples  34 

2.7  Performance  Criteria  and  Linear  Risk  38 

2.8  Expected  Error  and  Singularity  40 

2.9  Singularity  in  a  Non-factorable  Model  43 

3.  LINEAR  TRANSMISSION  AND  ADDITIVE  NOISE  46 

3.1  Additive  Stages  46 

3.2  Linear  Stages  53 

3.3  Multiplicative  Stages  54 

4.  APPLICATIONS  TO  SONAR  56 

4.1  The  Physics  of  Sound  Transmission  57 

4.2  The  Two  Variable  Wave  Equation  63 

4.3  The  Four  Variable  Wave  Equation  69 

4.4  Passive  Sonar  in  One-space  78 

4.5  Passive  Sonar  in  Three-space  81 

4.6  Implications  85 

5.  SUMMARY  88 

5.1  Contributions  of  this  Work  88 

5.2  Possible  Directions  for  Further  Work  89 

BIBLIOGRAPHY  91 




1.  INTRODUCTION 


The problem addressed in this dissertation did not just materialize
out of thin air, but rather evolved during an investigation of
algorithms for sonar array design. In this chapter the sonar array design
problem will be analyzed and the relevance of later chapters
established. The link between the analysis and the subsequent work is
the question of singularity of models of passive sonar detection as
the number of hydrophones increases without bound. In the limit,
this means continuous observation.

1.1  Sonar  Array  Design 

Passive sonar detection systems extend from sensors (hydrophones
used as acoustic energy to electric energy transducers) to outputs
which range from simple audio for human interpretation to complex
situation displays of diverse kinds. Too complex to be designed as
a whole, these systems must be partitioned; the traditional
partitioning allows a single lead out of the array subsystem into the
post-array processor and assumes that the spatial processing is to
be done in the array subsystem and time processing of the resulting
signal is to be done by the post-array processor. This yields a
factored sonar detection system in the sense of Middleton [1]
(Figure 1.1). That is, there are two operators: one, the array
processor, is a function of hydrophone position and signal location,
but is independent of signal and noise statistics (the signal is
assumed to be a stationary point source). The time operator is
dependent upon signal and noise statistics, but is independent of
the array geometry and target location. This factorization leads to
advantages in design, construction and operation that are manifest.
In many cases these advantages more than offset the sub-optimal
performance of the factored system.

Under these conditions, designing the array means placing the
hydrophones and combining their outputs to maximize some figure of
merit at the array subsystem output. Array output SNR is often used,
with delayed summing, i.e., beam-forming, as the processing.
Alternatively, if the signal is confined to a narrow frequency band,
beam width or the ratio of main beam level to maximum sidelobe
level may be used. Although the cost is increased considerably,
the beam width and side lobe levels can be partially controlled by
introducing shading factors. Other variations in the array
processing that have been investigated include multiplying the outputs of
several phones together before delayed summing (Shearman [1]) and
hard limiting the signals before delayed summing (Anderson [1],
Usher [1], Schultheiss [1]).




Within this basic beam-forming approach, relatively little
attention has been paid to the question of hydrophone location.
For instance, Skolnik, et al [1] used dynamic programming (Bellman
[1]) (even though the principle of optimality does not hold) to
position hydrophones in a linear array for "best" beam patterns.
Their paper contains references to related design approaches. The
more usual design method is to assume an ad hoc array geometry and
then to compute beam patterns and shading factors (Lowenstein [1]),
sometimes with explicit inclusion of inter-phone acoustic coupling
and the directionality of each phone. This computation is so
involved that an optimization algorithm (for phone position) based
upon it is impractical.

Even if the computation involved were reasonable, conventional
beam pattern optimization suffers from a serious defect: the beam-
forming/time processing type of detector is not optimal. In fact,
the optimal detector factors into separate space and time processors
only under very restrictive assumptions: if the spatial processing
is to be done first as in conventional beam-forming, a strong signal
assumption must be valid (Middleton and Groginsky [1]). If the
detector is to be optimum for small signals then a portion of the
time processing must be performed first (Goode [1], Bryn [1]), and
the observation time must be long. This fact is especially pertinent
today, when the optimal processor can be implemented digitally as
a special purpose computer, for there is no reason to believe that
an array designed for beam-forming is the best array for use with
an optimal processor. Consider, for instance, a noise source
located within the volume available to the array: a beam-forming
algorithm would put all the phones at a distance from the noise
source, while an optimal processor would place one phone right on
top of the local noise and then subtract the local noise from the
other phones.

One man has considered array design with optimal processing;
N.T. Gaarder, in his dissertation and two closely related papers
(Gaarder [1][2][3]) has found optimal radii for circular point
detector arrays when a likelihood ratio processor is used on the
array outputs. Unfortunately, his results depend in an essential
way upon a trick evaluation of the eigenvectors of the covariance
of the noise.# For our purposes, however, the chief defect in
Gaarder's work is the assumption of isotropic noise; an assumption
which he considers essential to the analysis (Gaarder [2] page 48).


# This trick is the same one used earlier by Vanderkulk [1]
and depends on the point detectors being arranged at the k roots
of -1 in the complex plane. Gaarder does not seem to have been
aware of either Vanderkulk [1] or Bryn [1]; at least he does not
reference them in Gaarder [2] or [3].




1.2  Array Optimization Algorithms

Why has Gaarder been the only one to consider array design
with optimal processing? Simply because any general approach must
founder on the shoals of numerical analysis. In order to illuminate
this point, we will try to formulate an array optimization algorithm
for likelihood ratio processing of the array outputs.

Let the observed pressure field be ξ(x_j, t) = ξ_j(t) at each of k
point sampling hydrophones where x_j is a vector in 3-space. A set
of n linear functionals {f_i} can be applied to ξ_j(t) to derive a set
of observation coefficients. These can be arranged in a single
vector η:

$\eta_m = (\xi_j, f_i), \quad m = i + n(j-1)$

We can form the likelihood ratio:

$\Lambda(\eta) = \mathrm{Prob}(\eta \in \{\text{signal+noise}\}) / \mathrm{Prob}(\eta \in \{\text{noise alone}\})$

Assuming that signal present and signal absent are a priori equally
probable, assuming small SNR, and assuming independent Gaussian
signal and noise, a well known computation yields

$\Lambda' = 2 \ln \Lambda(\eta) = (\eta, R^{-1} Q R^{-1} \eta)$

where Q and R are the covariances of the signal and noise processes,
respectively. Λ' is a function of the n·k observations η_m, but it
is also a function of the 3k coordinates

$x = \{x_j\}, \quad j = 1, \ldots, k$

For small signals at the input, the output signal to noise ratio
is a measure of the detector's performance. Some algebra gives

$\mathrm{SNR}_k(x) = \mathrm{Tr}(R^{-1} Q R^{-1} Q) / ((\mathrm{Tr}\, R^{-1} Q)^2 + 2\, \mathrm{Tr}(R^{-1} Q R^{-1} Q))$
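To make the computational burden concrete, the following Python sketch evaluates the SNR expression exactly as printed above for given covariance matrices Q and R. It is a minimal illustration under assumptions: real symmetric matrices supplied as nested lists, a plain Gauss-Jordan solve in place of whatever inversion method was contemplated, and hypothetical function names.

```python
def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    aug = [list(map(float, A[i])) + list(map(float, B[i])) for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda row: abs(aug[row][col]))
        if aug[piv][col] == 0.0:
            raise ValueError("noise covariance R is singular")
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for row in range(n):
            if row != col:
                f = aug[row][col]
                aug[row] = [vr - f * vc for vr, vc in zip(aug[row], aug[col])]
    return [row[n:] for row in aug]


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def output_snr(Q, R):
    """Tr(R^-1 Q R^-1 Q) / ((Tr R^-1 Q)^2 + 2 Tr(R^-1 Q R^-1 Q))."""
    W = solve(R, Q)            # W = R^-1 Q, obtained without forming R^-1 explicitly
    t1 = trace(W)              # Tr(R^-1 Q)
    t2 = trace(matmul(W, W))   # Tr(R^-1 Q R^-1 Q)
    return t2 / (t1 ** 2 + 2.0 * t2)


identity4 = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
Q4 = [[0.1 * v for v in row] for row in identity4]
snr_white = output_snr(Q4, identity4)
```

Because the hydrophone coordinates enter only through R and Q, every trial geometry requires a fresh solve of this kind, which is the coupling that rules out optimizing one hydrophone at a time.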




This expression reveals the essential difficulty that any
optimization algorithm must overcome: the independent variables,
x = {x_j}, j = 1, ..., k enter the function SNR_k(x) through a matrix
inversion. This means that the principle of optimality does not
hold, so that a simultaneous optimization in 3·k variables is
necessary instead of the k optimizations in 3 variables that could
be handled easily. Needless to say, an analytic derivation of the
extreme points is very difficult, and has been carried out only
by Gaarder [2] and only in a very special case.

The computation of SNR_k(x) at a single point x is a formidable
task, especially since the values of k which are of practical
interest are in the 100+ range. Since the inversion of a matrix
much larger than 20 by 20 is generally conceded to be possible
only after extensive study of the particular case,* a
straightforward computational approach would be a massive undertaking.
Prodigious amounts of machine time and several man-years of effort
could be consumed with no guarantee of success except for small k.


* Wilkinson [1]. Matrix inversion algorithms fail on matrices
that are ill-conditioned; correlation matrices are generally ill-
conditioned. Extensive study of a given matrix would be needed to
estimate the degree of ill-conditioning and adapt an inversion
algorithm to it. Special techniques, such as the use of programmed
multiple-precision arithmetic, might be needed to do the job. All
of this implicitly assumes long detection times so that the linear
functionals can be Fourier transform coefficients. In this special
case, R will consist of n uncoupled k by k submatrices on the main
diagonal, leading to a much easier inversion of R than is the case
for a general R.




Hamming has said "The purpose of computing is insight..."
(Hamming [1]). At the very least, then, before making an investment
of the magnitude estimated above, one would like to have some hope
of a substantial reward of insight. In this case, one would like
to know that the optimized array would offer substantially improved
performance. This means knowing that

$\Delta_k = \max_x \mathrm{SNR}_k(x) - \mathrm{SNR}_k(x_0)$

is at least 5 or 6 db, where x₀ represents any reasonable array
geometry: uniformly distributed hydrophones, for instance.

It is clear that max_x SNR_k(x) is a strictly increasing* function
of k. One way, then, to estimate Δ_k for a particular signal and
noise distribution, would be to substitute lim_k max_x SNR_k in place of
max_x SNR_k. Unfortunately, the limiting value is less available than
the maximum. If the limit exists, though, Δ_k can be estimated
thusly: taking L >> k,

$\Delta_k = \mathrm{SNR}_L(x_0) - \mathrm{SNR}_k(x_0) + \Delta_L + \epsilon_{kL}$

$\Delta_L = \max_x \mathrm{SNR}_L(x) - \mathrm{SNR}_L(x_0)$

$\epsilon_{kL} = \max_x \mathrm{SNR}_k(x) - \max_x \mathrm{SNR}_L(x)$

where ε_kL increases to 0 as k and L go to infinity and Δ_L decreases
to 0 as L goes to infinity. For fixed k and large enough L, then,
Δ_k is approximated by


* Let x̃ be the point at which SNR_{k−1}(x) attains its maximum.
Add an additional hydrophone at x_k ≠ x_j for x_j ∈ x̃.
Since it picks up some signal power, its output improves the
performance of the optimal detector, so that there exists x* for
which

$\max_x \mathrm{SNR}_k(x) \ge \mathrm{SNR}_k(x^*) > \max_x \mathrm{SNR}_{k-1}(x)$

with an error which is positive for large enough L. This means
that if we compute $\Delta_k'$ and it is small, then there surely is not
enough reward to justify a large investment in array design, at
least for that particular signal and noise model. On the other
hand, if $\Delta_k'$ is large, then we can not be certain of a substantial
reward, but we can be hopeful. In words, this estimate is derived
by  comparing  the  detection  performance  of  systems  with  moderately 
dense  and  highly  dense  hydrophone  arrays,  where  the  arrays  are 
both  in  the  same  volume. 

The underpinning of this heuristic calculation is the existence
of $\lim_k \max_x \mathrm{SNR}_k(x)$*. Or, put another way, the model must be
non-singular in the limit of continuous observation. This first step
is not trivial; many models of passive sonar detection are singular
in the limit of continuous observation. The body of this dissertation
is devoted to understanding why this is so, and determining
how to avoid it.


*  The  particular  expression  given  for  SNR(x)  is  valid  only  for 
small  signal  at  the  output  of  the  detector.  The  difficulty  in 
evaluating  still  remains  in  any  figure  of  merit  for  the  optimal 
detector,  however. 




1.3  Previous  Work  In  Singularity  of  Models 

In the last section we saw that if a model is singular (i.e.,
if detection is perfect) in the limit of continuous observation
then it will not be useful in array design. We are led by this
route to consider singularities in detection models.

A model is a mere semblance, a mathematization of a portion
of reality. As a construct it can best be judged by its fruitfulness;
as an image of the real it must be judged by the faithfulness
of its representation. Now, singularity as a property of the
mathematical model is neither good nor bad but merely interesting.

When  considered  as  a  reflection  of  reality,  however,  it  is  an 
affront to our sensibilities - experience teaches that nowhere is
there  perfection,  everything  is  fuzzy  around  the  edges,  nothing 
works  perfectly.  Therefore,  a  singular  model  can  not  be  a  faithful 
representation  of  the  real. 

Of  course,  singularity  is  a  fascinating  subject  in  its  own 
right,  but,  to  deepen  our  understanding  of  the  world  we  need  models 
of greater faithfulness, which in detection theory means
non-singular ones. The tension implicit in this statement is reflected
in  the  literature  dealing  with  singularity:  it  has  been  studied 
by  two  nearly  disjoint  sets  of  workers.  On  the  one  hand  stand  the 
mathematicians  and  closely  related  types  (e.g.  Yaglom  [1]). 
Generally  speaking  this  group  has  concerned  itself  with  conditions 
for  orthogonality  or  equivalence  of  Gaussian  measures  (so-called 
structural  questions).  This  problem  was  solved  to  a  purist's 
satisfaction by Feldman [1], who proved that two Gaussian measures
are either orthogonal or equivalent. This result was proved, at
least partially, at about the same time by Hajek [1]. A less
general form of this theorem had been proved earlier by Grenander
[1], and [illegible] by Kallianpur and Oodaira [1] with
the aid of the theory of reproducing kernel Hilbert spaces (Aronszajn
[1]). The essential ideas as they relate to singularity of models
of detection are discussed in section 3.2.

Recognizing that Feldman's conditions are too abstract for
easy application, a slightly more practical group of mathematicians
has tried to derive different formulations of Feldman's theorem.
Conditions in terms of the covariances of the Gaussian processes
defined by the measures would be infinitely more useful. Feldman
[number illegible], [name illegible] and Varberg [1] have managed to
do this for special cases. [Illegible] results are extensions of ones
provided by Slepian [1] when one of the processes has a rational
spectral density.

[The remainder of this page is illegible in the source copy. The
recoverable fragments refer to a result of Shepp on zero mean Gaussian
processes and note that many articles have been written on some aspect
of this question, e.g. Shepp.]


With the exception of Slepian [1] (and he is not an engineer by
training) the singularities have variously: been accepted without
question (Martel and Mathews [1]); been considered relatively
unimportant (Vanderkulk [1], Gaarder [2][3]); been eliminated by
addition of a white self-noise at the detector (Root [1]). The
white noise solution to the singularity dilemma works well enough
when  functions  of  a  single  variable  are  under  consideration,  but, 
as  both  Vanderkulk  and  Gaarder  show,  array  based  detection  models  can 
be  singular  as  k  increases  to  infinity  even  in  the  presence  of  white 
noise  at  each  point  detector,  although  detector  performance,  as 
measured  by  the  array  gain,  does  increase  extremely  slowly  with  k. 
This  is  a  puzzling  behavior,  and  although  it  raises  serious  questions 
about  the  adequacy  of  the  models  that  are  being  manipulated*,  neither 
author  offers  a  discussion  or  rationalization. 

There has not been, then, a satisfactory treatment of model
singularity, especially for non-Gaussian problems and for sonar
models. The remaining chapters offer a treatment of singularity
in models of detection and communication which is useful in sonar
and  which  is  independent  of  the  random  processes  involved. 


* If detection becomes perfect as the number of phones increases,
how  can  one  be  sure  that  the  model  is  close  to  the  real  world,  unless 
a  comparison  with  a  better  model  has  established  a  range  of  validity 
for  the  simpler  one? 




2 


MODELS  OF  COMMUNICATION  AND  DETECTION  SYSTEMS 


That  system  analyses  are  performed  upon  models  of  reality  is 
obvious. It is less so that problems more difficult than performance
evaluation deal not with models but with classes of models.
For example, questions of system sensitivity force consideration of
all those models which are close, in some metric, to a given one,
while system synthesis means attempted maximization of a performance
criterion over a collection of models. These facts make an explicit
discussion of classes of models desirable.

The basic theme of this chapter is the introduction of classes
of  models,  which  is  accomplished  in  sections  2.1  and  2.2.  Some 
basic  mathematical  problems  are  discussed  in  section  2.3  and  then 
two  examples  are  presented  in  section  2.4.  In  2.5,  several  methods 
of  topologizing  a  class  of  models  are  discussed.  In  section  2.7  common 
performance  criteria  are  derived  and  applied  in  an  example.  Finally, 
singular  models  are  defined  and  their  effects  evaluated  in  section 
2.8.  The  final  section,  2.9,  is  an  example  of  singularity  sneaking 
into a non-factorable model.

A  moderate  background  in  the  measure  theoretic  development  of 
probability  theory  is  required  for  this  chapter.  Knowledge  of  Halmos 
[1]  or  Kingman  [1]  is  sufficient.  In  addition,  the  development  of 
stochastic  processes  as  probability  measures  on  appropriate  spaces 
of  sample  functions  is  assumed.  The  first  and  last  chapters  of 
Parthasarathy  [1]  contain  relevant  material.  Notation  is  standard, 
or is defined as it appears. Distribution is used synonymously
with probability measure. Operator, map and transformation are
used to mean function. Supp($\mu$) is any support of the measure $\mu$,
while supp($\mu$) is the closed support of $\mu$, that is, the smallest of
all of the closed sets which support $\mu$.

2.1  The  Concept  of  Classes  of  Models 

By  suppressing  all  detail,  a  communication  or  detection  (C/D) 
system  can  be  modeled  by  an  operator  which  maps  a  set  of  source 
messages  into  a  set  of  estimated  source  messages.  See  Figure  2.1. 
But,  it  is  the  detail  which  is  of  interest  to  the  analyst;  this 
internal  structure  may  be  modeled  conveniently  by  a  series  of 
composed  operators,  as  in  the  example  displayed  in  Figure  2.2. 
Models  of  particular  C/D  systems  may  require  different  operators, 
but  the  general  structure  shown  is  sufficient  to  represent  all 
open-loop C/D systems.

Since  C/D  systems  are  probabilistic  in  nature,  the  operators 
must  be  stochastically  determined;  this  is  depicted  in  Figure  2.3. 
The  model,  taken  as  a  whole,  must  then  be  a  probability  space,  but, 
one  with  a  rather  complicated  internal  structure.  A  point  in  this 
space  for  the  example  consists  of  a  single  source  character,  a, 
and four operators, g, n, t, e, so that $\hat{a}$, the estimated signal, is
related to a by

$$\hat{a} = (g \circ n \circ t \circ e)(a)$$

Another point might consist of the same original message a, followed
by a different set of operators, $g_1$, $n_1$, $t_1$, $e_1$, so that

$$\hat{a}_1 = (g_1 \circ n_1 \circ t_1 \circ e_1)(a)$$


[Figure 2.1: Communication/Detection Systems]

[Figure 2.2: More Detailed Description of a C/D System Operating upon
a Message Character a. Operators shown: Encode, Transmit, Receive,
Decode. Signals shown: Source Character, Channel Signal, Receivable
Signal, Received Signal, Estimated Signal.]

[Figure 2.3: General Model of C/D Systems with Explicit Representation
of Randomness. Spaces of operators: Source, Encode, Transmit, Receive,
Decode. Spaces of signals.]


This representation of C/D models leads naturally to consideration
of classes of models. By limiting membership in the sets
A, W, V, R, E, T, N and G, a particular class of models can be
constructed which embodies the constraints imposed on the C/D
system by the outside world (in jargon, by the supra-system of
which the C/D system is a component). Different measures $\mu$ then
represent different models. Questions of optimization, sensitivity
and the like, which imply a reference class (find the optimal system
in this class of systems) can now be discussed with the universe
of permissible systems explicitly represented.

2.2  Definitions 

Having  motivated  everyone,  it  is  time  to  be  more  precise. 

We begin by formalizing the discussion given in the preceding section.
Let $S_i$ be a complete metric space with metric $d_i(\cdot,\cdot)$ and obtain
a measurable space, also denoted by $S_i$ where no confusion can result,
by generating a $\sigma$-field $B_i$ from the open sets in $S_i$. Elements of
$B_i$ are Borel sets of $S_i$ and $B_i$ is the Borel $\sigma$-field of $S_i$.

If $S_{2k}$ is a space of measurable mappings of $S_{2k-1}$ into $S_{2k+1}$
for k=1,2,3,...,L and $\mu$ is a probability measure on the measurable
product space

$$S = S_1 \times S_2 \times S_4 \times \cdots \times S_{2L} = \prod_{j \in E} S_j$$

where

$$E = \{1\} \cup \{2, 4, 6, \ldots, 2L\}$$

then the 2L+2 tuple

$$M = (S_1, S_2, \ldots, S_{2L+1}, \mu)$$

is called a level L C/D model. It will also be called an L-stage
(C/D) model.

That this definition is not vacuous is demonstrated by the
following example. Let $S_1 = S_3 = (R, B)$, that is, the real line under
the standard metric. Elements of $S_2$ will be additive translations:
if $t_a \in S_2$ then

$$t_a : s_1 \mapsto s_1 + a \in S_3 \quad \text{for all } a \in R$$

$S_2$ becomes isometric (and isomorphic with respect to addition,
although that does not concern us here) to $S_1$ and $S_3$ if we take

$$d_2(t_a, t_b) = |a - b|$$

hence $S_1$, $S_2$ and $S_3$ are all complete separable metric spaces.
Since elements of $S_2$ are clearly measurable mappings, every 4-tuple
$M = (S_1, S_2, S_3, \mu)$ where $\mu$ is any probability measure on
$S = S_1 \times S_2$ is a level 1 C/D model or a one stage C/D model.

For odd j>1, the measure $\mu$ should in some sense induce a
measure, call it $\mu_j$, on $S_j$. Let $C(T_i)$ be a measurable cylinder*
set in S with base $T_i$ in $S_i$. Define (for even i only)

$$\mu_i(T_i) = \mu(C(T_i))$$

so that $\mu_i$ is the marginal measure on the measurable space $S_i$.
Now extend this notation to prism sets*. If F is any subset of
indices from the set E, then $C(T_F)$ is the prism set defined by

$$C(T_F) = \bigcap_{j \in F} C(T_j)$$

*  Cylinder  set  as  defined  in  Cramer  [1],  page  17.  Prism  set  is 
his  rectangle  set  when  the  index  set  F  contains  only  two  indices. 


where the $C(T_j)$ are all cylinder sets. Obviously,

$$\mu_F(T_F) = \mu(C(T_F))$$

defines a marginal measure on the measurable product space
$\prod_{j \in F} S_j$. Taking

$$E_{2n} = \{1, 2, 4, \ldots, 2n\} \quad \text{for all } n \ge 1$$

we are now able to define an induced measure for odd k by:

$$\mu_k(Q) = \int \chi_{t_1}(t_2^{-1} \circ t_4^{-1} \circ \cdots \circ t_{k-1}^{-1} Q)\, \mu_{E_{k-1}}(dt_1\, dt_2 \cdots dt_{k-1}) \qquad \text{(E2.1)}$$

where the integral is taken over $\prod_{j \in E_{k-1}} S_j$ and $\chi_t(Z)$ is
the indicator function of the set Z (that is, $\chi_t(Z) = 1$ if
$t \in Z$, and 0 otherwise). This definition holds whenever
the integral exists. If the integral fails to exist, $\mu_k$ is undefined.
C/D models of such a pathological nature are interesting
in the same way that the plague is: both are to be avoided. There
are interesting mathematical problems here; they are discussed
in section 2.3.

An important simplification in this relationship occurs when
$\mu$ is a product of marginal measures, that is, when the stochastic
operations of the model are independent. In that case,

$$\mu = \mu_1 \mu_2 \mu_4 \cdots \mu_{2L} = \prod_{j \in E} \mu_j \qquad \text{(E2.2)}$$

and

$$\mu_k(Q) = \int_{S_{k-1}} \mu_{k-2}(t^{-1}Q)\, \mu_{k-1}(dt) \quad \text{for all odd } k > 1$$

A C/D model with this property will be termed factorable.
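
The relation for factorable models can be illustrated on a small discrete case. The sketch below is only an illustration under stated assumptions: a one stage model whose source space and received space are the integers, whose operator space holds the translations t_b(x) = x + b, and whose marginal measures are finitely supported. The names `induced_measure`, `mu1` and `mu2` are hypothetical and do not appear in the report.

```python
from collections import defaultdict
from fractions import Fraction


def induced_measure(mu1, mu2):
    """Induced measure mu3 for a factorable one stage model with translations.

    mu1: dict mapping a source character a to its probability (marginal on S1).
    mu2: dict mapping a shift b to its probability (marginal on S2, where the
         operator is the translation t_b(x) = x + b).
    Evaluates mu3({x}) = sum over t of mu1(t^{-1}{x}) * mu2(t).
    """
    mu3 = defaultdict(Fraction)
    for b, pb in mu2.items():
        for a, pa in mu1.items():
            # t_b^{-1}{x} = {x - b}, so the mass pa * pb lands on x = a + b.
            mu3[a + b] += pa * pb
    return dict(mu3)


# Hypothetical marginals: two source characters and three possible shifts.
mu1 = {0: Fraction(1, 4), 1: Fraction(3, 4)}
mu2 = {-1: Fraction(1, 2), 0: Fraction(1, 4), 1: Fraction(1, 4)}
mu3 = induced_measure(mu1, mu2)
```

Exact rational arithmetic is used so that the total mass of the induced measure can be checked without rounding error. For translations the induced measure is simply the convolution of the two marginals.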

When the target space $S_k$ is $R^n$, the distribution $\mu_k$ is equivalent
to a density $m_k$ (allowing the density to be a generalized function):

$$m_k(x) = \int_{S_{k-1}} m_{k-2}(t^{-1}x)\, \mu_{k-1}(dt) \qquad \text{(E2.3)}$$

If $\mu_{k-1}$ has an associated density:

$$\mu_k(Q) = \int_{S_{k-1}} \mu_{k-2}(t^{-1}Q)\, m_{k-1}(t)\, dt \qquad \text{(E2.4)}$$

and if both $m_{k-2}$ and $m_{k-1}$ exist:

$$m_k(x) = \int_{S_{k-1}} m_{k-2}(t^{-1}x)\, m_{k-1}(t)\, dt \qquad \text{(E2.5)}$$


Conditional probability calculations play an important role
in detection theory, so it will be well to consider conditioning
of C/D models. Suppose that a particular character a has been
transmitted, and define

$$E'_{2n} = \{2, 4, \ldots, 2n-2, 2n\}$$

and let $\underline{t}$ be elements of $S'_{2n} = \prod_{j \in E'_{2n}} S_j$. If

$$\mu^a_{k+1}(Q) = \int_{S'_k} \chi_a(\underline{t}^{-1}Q)\, \mu^a(d\underline{t})$$

we see that $\mu^a_{k+1}$ is the measure induced on $S_{k+1}$ conditioned by
transmission of a.

Extended to set conditioning: (for $\mu_1(R) > 0$)

$$\mu^R_{k+1}(Q) = (\mu_1(R))^{-1} \int_R \mu^a_{k+1}(Q)\, \mu_1(da)$$


2.3  Induced  Measures* 

In section 2.2, C/D models were defined which consist of an
original space of randomly chosen characters and a number of spaces
of stochastic operators which take the source characters through
a series of transformations. Now we want to deal with the measures
induced on the spaces $S_k$ for odd k>1. Under easily satisfied
conditions, the measures defined by E2.1 will be induced, that is,
the integral will exist and so will the induced measure defined by
that integral.

The most expeditious approach to this subject requires a
certain redirection of our thought. To begin, we note that an
L-stage model

$$(S_1, S_2, \ldots, S_{2L+1}, \mu)$$

is more than an arbitrary 2L+2 tuple. An assumed structure exists:
every even index space $S_k$ consists of operators which map $S_{k-1}$
into $S_{k+1}$, or put another way, for k=2j, j=1,2,...,L there is a
map $\phi_k$ of $(S_{k-1}, S_k)$ into $S_{k+1}$ which is defined by

$$\phi_k : (u,v) \mapsto v(u) \in S_{k+1} \quad \text{for all } (u,v) \in (S_{k-1}, S_k)$$

The set $\{\phi_{2j} : j=1,2,\ldots,L\}$, one map for each stage of the model,
represents the totality of ways in which source characters are
transformed into detected characters. By modifying the definition
of $\phi_k$ slightly, this can be made more explicit. Let $\theta_k$ map $(S, S_{k-1})$
into $(S, S_{k+1})$ according to the rule:**


*  This  section  has  benefited  greatly  from  conversations  with 
Prof.  M.  Kean-'-  of  the  Yale  Mathematics  Department. 

** We remember that $S = (S_1, S_2, S_4, \ldots, S_{2L})$


$$\theta_k(S, S_{k-1}) = \theta_k(S, S_{k-1}, S_k) = (S, \phi_k(S_{k-1}, S_k)) = (S, S_{k+1})$$

In every essential way, then, $\theta_k$ and $\phi_k$ are equivalent. In particular,
$\theta_k$ is a measurable map iff $\phi_k$ is.

Going one step further, let's compose several $\theta_k$ starting at the
first stage, k=2:

$$\Psi_k = \theta_k \circ \theta_{k-2} \circ \cdots \circ \theta_4 \circ \theta_2$$

We see that $\Psi_k$ maps $(S,S_1)$ into $(S,S_{k+1})$ for each even k up to 2L.
$\Psi_k$ is measurable if each $\phi_j$, $j \le k$ is measurable, since composition
preserves measurability. $\Psi_k$ is not in exactly the form we would
like, however. To get that form, let v be the binary selection
operator; if x is any n-tuple, $x=(x_1,x_2,\ldots,x_n)$ where n may be
infinity, then

$${}_j v\, x = x_j$$

Now define $\Gamma_k$ as

$$\Gamma_k(S) = {}_2 v\, \Psi_k(S,S_1) = {}_2 v\,(S,S_{k+1}) = S_{k+1}$$

$\Gamma_k$ expresses the way in which S is mapped into $S_{k+1}$ by the model.
Note that $\Gamma_k$ is measurable iff $\Psi_k$ is, with the result that $\Gamma_k$ is
measurable whenever $\phi_j$ is for all $j \le k$. For reference, we state
this as a theorem:

THEOREM 1 $\Gamma_k$ is measurable if $\phi_j$ is measurable for all $j \le k$

The existence of induced measures on $S_{k+1}$ can now be expressed
as:

THEOREM 2 If $\Gamma_k$ is measurable then the integral in E2.1 exists
and $\mu_{k+1}$ is the measure induced on $S_{k+1}$

PROOF Measurability of $\Gamma_k$ means that $\mu_{k+1}$ defined by
$\mu_{k+1}(Q) = \mu(\Gamma_k^{-1}Q)$ is a measure. But this is just E2.1

The previous theorems depend entirely on the product topology
assumed for the models. As a direct result, we need only consider
the measurability of $\phi_k$ in trying to derive conditions applicable
to whole models which will guarantee the existence of all induced
measures in the models. The first result is an extension of the
well known case when $S_2$ is a single operator:

THEOREM 3 $\phi$ is measurable whenever $S_2$ is a separable discrete space

PROOF $S_2$ must consist of a countable number of points $\{b_i\}$.
Letting Q be a measurable set in $S_3$, $b_i^{-1}Q$ is a measurable
set since all $b_i$ are measurable. So,

$$\phi^{-1}Q = \bigcup_i (b_i^{-1}Q, b_i)$$

is measurable.

This is true for completely arbitrary spaces $S_1$ and $S_3$. Countability
of $S_1$, on the other hand, is not sufficient to ensure measurability
of $\phi$. For instance, let $S_1 = \{1\}$, $S_3 = \{0,1\}$ and $S_2(D) = \{b_x : x \in R^*\}$
where the maps are defined by

$$b_x(1) = \{1 \text{ if } x \in D, \text{ and } 0 \text{ otherwise}\}$$

and D is any subset of the extended real line $R^*$. Metrics on $S_1$
and $S_3$ are trivial; on $S_2$ let

$$d_2(b_x, b_y) = |x - y|$$

so that $S_2$ is isomorphic and isometric to $R^*$. If D is any
non-measurable set then $\phi^{-1}(\{1\})$ is non-measurable, even though each
operator $b_x$ in $S_2(D)$ is measurable.

The problem in this example is that $b_n \to b$ in $S_2$ does not imply
that $b_n(x) \to b(x)$ in $S_3$. This difficulty need not arise... in fact,
the following theorem shows that point-wise convergence of the
operators in $S_2$ is sufficient for continuity, hence measurability
of $\phi$. Necessary conditions for the measurability of $\phi$ remain to
be discovered, however.

THEOREM 4 If the maps in $S_2$ are continuous and convergence in $S_2$
is pointwise, then $\phi$ is continuous, hence measurable.

PROOF Let $\{a_i\}$ be a subset of $S_1$, $a_i \to a$ and $\{b_j\}$ be a subset of
$S_2$, $b_j \to b$. Then

$$d_3(\phi(a_i,b_j), \phi(a,b)) = d_3(b_j(a_i), b(a))$$
$$\le d_3(b_j(a_i), b_j(a)) + d_3(b_j(a), b(a))$$
$$\le \epsilon(a_i) + \epsilon(b_j)$$

where the first term converges to zero by continuity
of the $b_j$ and the second by the point-wise convergence
in $S_2$. This shows that $\phi$ is a continuous map of
$(S_1, S_2)$ into $S_3$.

As an important example of a situation where $S_2$ is a space to
which Theorem 4 applies, let $S_1$ and $S_3$ be metric spaces and $S_2$ the
set of all continuous maps of $S_1$ into $S_3$ which have bounded ranges,
meaning that the range of each $b \in S_2$ can be covered by a single ball
of finite radius. $S_2$ becomes a metric space if the metric is:

$$d_2(b_1, b_2) = \sup_{a \in S_1} d_3(b_1(a), b_2(a))$$

Verifying the metric space axioms:

1. $d_2(b_1,b_2) \ge 0$ and $= 0$ iff $b_1 = b_2$

2. $d_2(b_1,b_2) = d_2(b_2,b_1)$

3. $d_2(b_1,b_2) + d_2(b_2,b_3) \ge d_2(b_1,b_3)$
which follows from

$$d_2(b_1,b_3) = \sup_a d_3(b_1(a), b_3(a))$$
$$\le \sup_a \big(d_3(b_1 a, b_2 a) + d_3(b_2 a, b_3 a)\big)$$
$$\le \sup_a d_3(b_1 a, b_2 a) + \sup_a d_3(b_2 a, b_3 a)$$
$$= d_2(b_1,b_2) + d_2(b_2,b_3)$$
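
The three axioms can be checked numerically on a simple case. The sketch below assumes that the source space is the interval [-5, 5], that the target space is the real line with the absolute-value metric, and that the supremum is approximated by a maximum over a finite grid (which can only underestimate the true supremum). The function name `sup_metric` and the three sample maps are illustrative choices, not taken from the report.

```python
import math


def sup_metric(b1, b2, points):
    """Approximate d2(b1, b2) = sup_a d3(b1(a), b2(a)) over a finite sample of S1.

    d3 is taken to be the absolute difference on the real line.
    """
    return max(abs(b1(a) - b2(a)) for a in points)


# Finite grid standing in for S1 = [-5, 5].
points = [i / 100.0 for i in range(-500, 501)]

# Three continuous maps with bounded ranges.
b1, b2, b3 = math.sin, math.cos, math.tanh

d12 = sup_metric(b1, b2, points)
d23 = sup_metric(b2, b3, points)
d13 = sup_metric(b1, b3, points)
```

On a finite grid the triangle inequality holds for the same reason as in the derivation: the maximum of a sum never exceeds the sum of the maxima.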

As  an  easy  consequence  of  these  definitions  we  have: 

THEOREM 5 If $S_3$ is complete, so is $S_2$

As a complete metric space, $S_2$ supports a Borel $\sigma$-field and
$S_1$, $S_2$, $S_3$ form a class of C/D models (when taken over all measures
of weight 1 on the measurable product space S). Since convergence
in $S_2$ is pointwise, Theorem 4 applies and a measure $\mu_3$ is induced
on $S_3$ by the model. Many other spaces have a suitable topology
also. For instance, if $S_1$ and $S_3$ are Banach spaces then $S_2$ can be
taken as the space of normed linear operators from $S_1$ into $S_3$, and
Theorem 4 will apply.

The  results  given  here  are  not  inclusive,  but  they  serve  our 
immediate  needs:  all  of  the  models  used  in  the  sequel  will  satisfy 
the  conditions  of  Theorem  4.  In  the  obvious  cases,  no  mention 
will  be  made  of  this  fact. 




2.4  Two  Examples  of  Classes  of  Models 

Examples serve to clarify general concepts, so several will
be presented using the ideas introduced in sections 2.1 and 2.2.

Detection of Gaussian Signals in Additive Gaussian Noise

Consider the problem of detecting a Gaussian signal obscured
by additive Gaussian noise, but otherwise unaffected by transmission.
A model for this situation follows: (the Synonym column references
section 2.1 while the Space column references section 2.2)


Space  Synonym  Meaning and Definition
-----  -------  ----------------------

S_1    A        The set of source characters. Here, the set
                {1,0} interpreted as {signal, no signal}.

S_2    E        Encoding operators. Here the signals are
                Gaussian, so, take E to be the set {e_h} where
                h ranges over (abstract) Hilbert space, H.
                Then, for a in A, e_h(a) = a·h.

S_3    W        The channel waveforms. In this case, the Hilbert
                space H.

S_4    N        Noise operators. The noise is additive
                Gaussian, so take N to be the set {n_h : h in H}.
                For w in W, n_h(w) = w + h.

S_5    R        Received waveforms. Again, just H, which
                recurs several times in this example because
                the transmission mapping is the identity
                operator.

S_6    G        Detection operators. Frequently the detection
                operator is not stochastic; in this example
                assume that it is not and take G = {g} for
                some fixed g: R -> A.

S_7    Â        Estimates; the set {1,0}, interpreted as
                {signal present, no signal present}.

Specification of $\mu$ completes the model. Assuming independence
(see E2.2) we have

$$\mu = \mu_1 \mu_2 \mu_4 \mu_6$$

where

$\mu_1$ is discrete, $\mu_1\{i\} = p_i$

$\mu_6$ is degenerate, $\mu_6\{g\} = 1$

$\mu_2$ and $\mu_4$ are independent Gaussian distributions.

The induced distributions $\mu_3$, $\mu_5$ and $\mu_7$ are also of interest.
$\mu_3$ is not Gaussian because of the spike of mass $p_0$ at the origin.
Of course, $\mu_3^a$, the distribution in $S_3$ conditioned by a value a in $S_1$
is Gaussian. The same holds for $\mu_5$: unconditioned it is not a
normal distribution, but conditioned by $a \in S_1$, it is. Finally,
$\mu_7\{i\} = d_i$, the detection probability of $a_i \in S_1$.

If H has finite dimension n, then it is isomorphic and isometric
to $R^n$. Suppose then that $\mu_2$ is the distribution which has
autocovariance P and mean p while $\mu_4$ has autocovariance Q and mean q.
The densities are:

$$m_2(h) = \frac{1}{\sqrt{(2\pi)^n |P|}} \exp\!\big(-(h-p,\, P^{-1}(h-p))/2\big)$$
$$m_4(h) = \frac{1}{\sqrt{(2\pi)^n |Q|}} \exp\!\big(-(h-q,\, Q^{-1}(h-q))/2\big)$$


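In order to derive $\mu_3$, recall that $\chi_z(F)$ is the indicator
function of the set F and is defined by

$$\chi_z(F) = \{1 \text{ if } z \in F,\ 0 \text{ otherwise}\}$$

Also, let

$$F_0 = F - \{0\}$$

Then

$$\mu_3(F) = \int_{S_2} \mu_1(e^{-1}F)\, \mu_2(de)$$
$$= \int_{S_2} \mu_1(e^{-1}F_0)\, \mu_2(de) + \chi_0(F) \int_{S_2} \mu_1(e^{-1}\{0\})\, \mu_2(de)$$
$$= p_1 \int_{F_0} \mu_2(de) + p_0 \chi_0(F)$$

or, in terms of densities:

$$m_3(x) = p_1 m_2(x) + p_0 \delta(x - 0)$$

Similarly, $\mu_5$ can be expressed as

$$\mu_5(F) = \int_{S_4} \mu_3(n^{-1}F)\, \mu_4(dn)$$
$$m_5(x) = \int_{S_4} m_3(x-n)\, m_4(n)\, dn$$
$$= m_3 * m_4(x)$$
$$= p_1\, m_2 * m_4(x) + p_0\, m_4(x)$$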
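
The final expression for the received density can be sanity-checked in one dimension. The following sketch assumes n = 1, so that P and Q are scalar variances, and uses the standard fact that the convolution of two Gaussian densities is Gaussian with the means and variances added. The parameter values and the helper names `gauss`, `m5` and `integrate` are hypothetical and serve only as an illustration.

```python
import math


def gauss(x, mean, var):
    """One-dimensional Gaussian density with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def m5(x, p1, p, P, q, Q):
    """Received density p1 * (m2 * m4)(x) + p0 * m4(x), with p0 = 1 - p1."""
    p0 = 1.0 - p1
    # The convolution of N(p, P) with N(q, Q) is N(p + q, P + Q).
    return p1 * gauss(x, p + q, P + Q) + p0 * gauss(x, q, Q)


def integrate(f, lo, hi, n=20000):
    """Composite trapezoidal rule on [lo, hi] with n subintervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi)) + sum(f(lo + i * h) for i in range(1, n))
    return total * h


# Assumed values: signal prior 0.3, signal N(1, 2), noise N(0, 0.5).
mass = integrate(lambda x: m5(x, 0.3, 1.0, 2.0, 0.0, 0.5), -30.0, 30.0)

# Numerical convolution at one point, for comparison with the closed form.
conv = integrate(lambda n: gauss(0.7 - n, 1.0, 2.0) * gauss(n, 0.0, 0.5), -30.0, 30.0)
```

The total mass should come out very close to one, and the numerical convolution should agree with the closed-form Gaussian used inside `m5`.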

Known  Waveform  in  Additive  Gaussian  Noise  Communications 

This example is a slight modification of the previous one.
$S_1$ becomes {0,1,2,...,n} interpreted as the set

{no character, character #1, ..., character #n}

We take $\mu_1(i) = p_i$ as before, but the encoding operator is a fixed
map e: A -> W = H, so that $\mu_2$ is degenerate at the point e. This reflects
the known waveform assumption. W and N are unchanged, but the
induced distribution $\mu_5$ is more complex, being conditionally Gaussian
at each of the source characters. We may leave $\mu_6$ unchanged although
the detector, g, is a different operator. Finally, $\mu_7(i) = d_i$ as
before.


2.5  Topologies  on  Classes  of  Models 

When a class of models is to be manipulated, advantage can
often be taken of structural properties of the class. For example,
suppose that a performance criterion $\eta$ is defined on a class, C,
of models. (by this we mean that $\eta$ is a bounded, real-valued
function on the set C. For more detail, see section 2.7) If C
has only the discrete topology, then selection of the model which
maximizes $\eta$ can be done only by a straight search. On the other
hand, if C is a normed linear space, a gradient search technique
can be used. Since a class of models is really a set of measures,
it is the [illegible] structures on sets of measures which must
be investigated. We will look at two basic topologies for sets
of probability measures. One well known method embeds them within
a Banach space [illegible] while the other method, due to
Kakutani [1], embeds them within abstract Hilbert space.


Embedding  Within  a  Sup-norm  Banach  Space 

It is well known that the probability measures can be given a
metric derived from a Banach space of all finite measures. Here we
look at one Banach space based on the sup-norm. Let M be the set
of all signed, finite measures on a measurable space (X,S). M is
a linear space over the reals, and defining

$$\|\mu\| = \int_X d\mu^+ + \int_X d\mu^- = \int_X d|\mu|$$

we can easily verify that

1. $\|\mu\| \ge 0$, and $= 0$ iff $\mu = 0$ (true zero)

2. $\|a \cdot \mu\| = |a| \cdot \|\mu\|$ (scalar multiplication)

3. $\|\mu + \nu\| \le \|\mu\| + \|\nu\|$ (triangle inequality)

Proof: $d|\mu + \nu| \le d|\mu| + d|\nu|$

so that M is a normed linear space. It is also a Banach space
since it is complete:


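THEOREM 6 M is complete

PROOF Let $\{\mu_i\}$ in M be a Cauchy sequence, $\mu_i \to \mu \in \bar{M}$ where $\bar{M}$ is
the completion of M. We wish to show that $\mu$ is actually
in M so that $\bar{M} = M$. But,

$$\|\mu\| \le \|\mu - \mu_i\| + \|\mu_i\|$$

and since $\|\mu_i\|$ is bounded (it is Cauchy), $\mu$ is finite.
It remains to show that $\mu$ is countably additive.
Letting $\{B_i\}$ be a sequence of disjoint sets in S, set

$$Q_n = \Big| \mu\Big(\sum_{i=1}^{n} B_i\Big) - \sum_{i=1}^{n} \mu B_i \Big|$$

Now choose a subsequence of the $\mu_i$, call it $\mu_k$, so that
$\|\mu_k - \mu\|$ is non-increasing. (set [illegible] if necessary) Choose
a further subsequence $\mu_n$ from the sequence $\mu_k$ so that

$$\|\mu_n - \mu\| < \epsilon_n / 2^n$$

for an arbitrary $\epsilon_n > 0$. Then it is clear that

$$\sum_{i=1}^{n} |(\mu_n - \mu) B_i| \le \sum_{i=1}^{n} \|\mu_n - \mu\| \le \epsilon_n (1 - 2^{-n})$$

Since $\mu_n$ is countably additive,

$$Q_n = \Big| \mu\Big(\sum_{i=1}^{n} B_i\Big) - \mu_n\Big(\sum_{i=1}^{n} B_i\Big) + \mu_n\Big(\sum_{i=1}^{n} B_i\Big) - \sum_{i=1}^{n} \mu B_i \Big|$$
$$\le \Big| (\mu - \mu_n)\Big(\sum_{i=1}^{n} B_i\Big) \Big| + \sum_{i=1}^{n} |(\mu - \mu_n) B_i| < \epsilon_n$$

Choosing $\epsilon_n \to 0$, $Q_n \to 0$ as $n \to \infty$ and $\mu$ is countably additive.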

The probability measures form a subset P within M. Since
probability measures $\mu_i$ are distinguished by $d\mu_i \ge 0$ and by $\int d\mu_i = 1$
or $\|\mu_i\| = 1$, this subset is a small portion of the unit sphere in M.
P satisfies the metric space axioms if $d(\mu,\nu) = \|\mu - \nu\|$. P is
bounded since

$$d(\mu,\nu) \le d(\mu,0) + d(0,\nu) = 2$$

Furthermore, P is complete since $\{\mu_i\}$ in P with $\mu_i \to \mu$ means that

$$\big|\, \|\mu_i\| - \|\mu\| \,\big| \le \|\mu_i - \mu\| \to 0$$

which shows that $\|\mu\| = 1$.
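
For measures on a finite set the norm and the induced distance reduce to sums of absolute values, which makes the bound of 2 easy to see. The sketch below assumes measures represented as dictionaries of point masses; the names `tv_norm` and `distance` and the sample measures are hypothetical.

```python
def tv_norm(mu):
    """||mu|| = integral of d|mu| for a signed measure given as point masses."""
    return sum(abs(mass) for mass in mu.values())


def distance(mu, nu):
    """d(mu, nu) = ||mu - nu|| on the union of the two supports."""
    keys = set(mu) | set(nu)
    return tv_norm({k: mu.get(k, 0.0) - nu.get(k, 0.0) for k in keys})


# Two probability measures with overlapping supports.
mu = {"x1": 0.75, "x2": 0.25}
nu = {"x2": 0.5, "x3": 0.5}

# Two point masses with disjoint supports attain the bound of 2.
delta1 = {"x1": 1.0}
delta3 = {"x3": 1.0}
```

Here `distance(delta1, delta3)` equals 2, the largest value the distance can take between probability measures, while overlapping measures such as `mu` and `nu` are strictly closer.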




Lebesgue-Stieltjes  Measures  and  the  Space  M' 

When (X,S) is $(R^n, L^n)$, that is, Lebesgue measurable sets on
n-dimensional Euclidean space, the space M is especially interesting
since  its  elements  can  be  represented  explicitly  as  Lebesgue- 
Stieltjes  measures.  In  applications,  it  is  almost  always  this 
space  which  is  used  since  explicit  calculations  can  be  made  with 
relative  ease.  Often,  (e.g.,  with  Gaussian  distributions)  interest 
is  centered  on  distributions  which  are  continuously,  or  at  least 
piecewise  continuously,  differentiable.  These  form  a  dense 
manifold M' within $M(R^n)$ which is not closed under the M-norm.
Introducing  a  simple  sup-norm  on  the  derivatives,  however,  makes 
M'  itself  into  a  Banach  space.  The  set  P'  derived  from  P  in  a  like 
manner  is  a  subset  of  M'  of  course.  While  P  is  bounded,  P'  is  not, 
although  it  is  a  metric  space.  However,  P'  is  complete: 

LEMMA 1  If $p'_n \to p'$ in P' then $p_n \to p$ in P, where
$p_n = \int p'_n$ and $p = \int p'$.

PROOF  The integral is continuous.

THEOREM 7  P' is complete.

PROOF  Let $\{p'_n\}$ be a Cauchy sequence in P', that is,
$p'_n \to p' \in P''$, the completion of P'. $p'$ is piecewise
continuous because of the sup-metric in P'. By Lemma 1,
$\int p' = 1$ so that $p' \in P'$, or, $P'' = P'$, as required.




Embedding Within Hilbert Space (Kakutani [1])

The set P defined above can also be embedded within a Hilbert
space in order to form a metric space $P_H$. Consider first the
case $\mu \sim \nu$ and define

$$\rho(\mu,\nu) = \int_X \sqrt{d\mu/d\nu}\; d\nu$$

From Schwarz's inequality we can show that

1.  $0 < \rho(\mu,\nu) \le 1$

2.  $\rho(\mu,\nu) = 1$ iff $\mu = \nu$

3.  $\rho(\mu,\nu) = \rho(\nu,\mu)$

Proof:

$$\rho(\mu,\nu) = \int_X \sqrt{d\mu/d\nu}\; d\nu
  = \int_X \frac{1}{\sqrt{d\nu/d\mu}}\,(d\nu/d\mu)\, d\mu
  = \int_X \sqrt{d\nu/d\mu}\; d\mu = \rho(\nu,\mu)$$

Now we can obtain a quasi-metric (i.e., a metric without the triangle
inequality) by

$$\sigma(\mu,\nu) = -\mathrm{Ln}\, \rho(\mu,\nu) = -\mathrm{Ln} \int_X \sqrt{d\mu/d\nu}\; d\nu$$

since we have

1.  $0 \le \sigma(\mu,\nu) < \infty$

2.  $\sigma(\mu,\nu) = 0$ iff $\mu = \nu$

3.  $\sigma(\mu,\nu) = \sigma(\nu,\mu)$

Only the triangle inequality fails at times to hold. It is the
logarithm in the definition of $\sigma$ which distorts the "distance"
surface. For instance, let $X = \{x_1, x_2\}$ and suppose that:

              x_1     x_2
    μ         3/4     1/4
    ν         1/2     1/2
    ω         1/4     3/4


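Direct calculation shows that

$$\rho(\mu,\nu) = \rho(\nu,\omega) \approx 0.966, \qquad \rho(\mu,\omega) \approx 0.866$$

so that

$$\sigma(\mu,\nu) + \sigma(\nu,\omega) \approx 0.069 < 0.144 \approx \sigma(\mu,\omega)$$

which is a clear failure of the triangle inequality.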
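
The two-point example can be checked mechanically. The sketch below is only an illustration: it assumes that on a finite space the affinity reduces to a sum of square roots of products of point masses, and the function names `affinity` and `sigma` are introduced here, not taken from the report.

```python
import math

def affinity(p, q):
    """rho(mu, nu) for discrete measures given as tuples of point masses."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))

def sigma(p, q):
    """Quasi-metric sigma = -Ln rho; requires rho > 0."""
    return -math.log(affinity(p, q))

# Point masses on X = {x1, x2} from the table in the text.
mu, nu, omega = (0.75, 0.25), (0.5, 0.5), (0.25, 0.75)

two_step = sigma(mu, nu) + sigma(nu, omega)   # detour through nu
direct = sigma(mu, omega)                     # direct "distance"

print(f"rho(mu,nu)={affinity(mu, nu):.4f}  rho(mu,omega)={affinity(mu, omega):.4f}")
print(f"two-step={two_step:.4f}  direct={direct:.4f}")
print("triangle inequality holds:", direct <= two_step)
```

The detour through $\nu$ comes out shorter than the direct value, which is exactly the failure the text describes.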

Now let $\mu, \mu' \in P$ be arbitrary. We do not assume that
$\mu \sim \mu'$. Choose a third measure $\nu \in P$ such that
$\mu \ll \nu$, $\mu' \ll \nu$. $(\mu+\mu')/2$ is one such element, but
there may be many others. Now define two elements of $L^2(X,S,\nu)$ by

$$\Psi(\omega) = \sqrt{(d\mu/d\nu)(\omega)}, \qquad \Psi'(\omega) = \sqrt{(d\mu'/d\nu)(\omega)}$$

Both $\Psi$ and $\Psi'$ are on the unit sphere in $L^2$, and, when
$\mu \sim \mu'$ it is clear that

$$\rho(\mu,\mu') = \int_X \sqrt{d\mu/d\mu'}\; d\mu'
  = \int_X \sqrt{d\mu/d\nu}\,\sqrt{d\mu'/d\nu}\; d\nu = (\Psi,\Psi')$$

where $(\Psi,\Psi')$ is the inner product of $\Psi$ and $\Psi'$ in $L^2$.

This relationship provides a natural extension of $\rho(\mu,\mu')$
to those cases where it is not the case that $\mu \sim \mu'$. It is also
clear that $\rho(\mu,\mu')$ so defined is independent of $\nu$ so long as
$\mu \ll \nu$ and $\mu' \ll \nu$. Also, only when $\mu \perp \mu'$ does
$\rho(\mu,\mu') = 0$. Finally, since

$$\|\Psi-\Psi'\|^2 = \|\Psi\|^2 + \|\Psi'\|^2 - 2(\Psi,\Psi') = 2(1-\rho(\mu,\mu'))$$

we see that P can be made into a metric space $P_H$ by defining

$$d(\mu,\mu') = \|\Psi-\Psi'\| = \sqrt{2(1-\rho(\mu,\mu'))}$$

Since this metric is independent of the choice of $\nu$, $P_H$ has been
isometrically embedded into abstract Hilbert space.

The quasi-metric $\sigma$ can also be useful, especially for Gaussian
measures, since it induces the same topology on P as does $d(\cdot,\cdot)$.
This follows since there exist two constants $k_1$ and $k_2$ such that

$$k_1\,\sigma(\mu,\mu') \le d^2(\mu,\mu') \le k_2\,\sigma(\mu,\mu')$$

for either $\sigma(\mu,\mu')$ or $d(\mu,\mu')$ sufficiently small.


2.6  Application  to  Examples 

It  has  only  been  possible  to  apply  the  metrics  of  the  previous 
section  to  concrete  numerical  examples  in  the  case  of  independent 
models,  where  each  marginal  measure  can  be  treated  independently. 

The whole model can then be treated as one point in the appropriate
product metric space, although we do not explicitly do that here,
but are content to develop the metrics for the individual marginal
measures. The forms of the P-space, P'-space and $P_H$-space (Hilbert)
metrics are developed for countable spaces and for one-dimensional
Gaussian distributions.

Sequence Spaces

For distributions on denumerable spaces, such as those in
the examples of section 2.4, we define $p_i = \mu\{i\}$ and
$q_i = \nu\{i\}$. Then it is clear that:

Denumerable P-space Sup-metric

$$\|\mu-\nu\| = \max_i |p_i - q_i|$$

Denumerable $P_H$-space Hilbert Metric

$$d(\mu,\nu) = [2-2\rho(\mu,\nu)]^{1/2}
  = \Big[2 - 2\sum_i \sqrt{p_i}\sqrt{q_i}\Big]^{1/2}
  = \Big[\sum_i (\sqrt{p_i}-\sqrt{q_i})^2\Big]^{1/2}$$
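
As a small illustration of the two denumerable metrics, the following sketch evaluates both on finite probability vectors and checks that the two expressions for the Hilbert metric agree. The function names are hypothetical, and truncating a denumerable space to a finite list is an assumption made only for the example.

```python
import math

def sup_metric(p, q):
    """max_i |p_i - q_i| for two probability vectors of equal length."""
    return max(abs(a - b) for a, b in zip(p, q))

def hilbert_metric(p, q):
    """[sum_i (sqrt(p_i) - sqrt(q_i))^2]^(1/2)."""
    return math.sqrt(sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)))

def hilbert_metric_via_affinity(p, q):
    """[2 - 2 * sum_i sqrt(p_i q_i)]^(1/2); clamped at 0 against round-off."""
    rho = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return math.sqrt(max(0.0, 2.0 - 2.0 * rho))

p = (0.75, 0.25)
q = (0.50, 0.50)
print(sup_metric(p, q), hilbert_metric(p, q), hilbert_metric_via_affinity(p, q))
```

The agreement of the last two values is just the expansion of the square, using the fact that each vector sums to one.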


Gaussian Case


Gaussian distributions on $R^n$ are also interesting. Let F
and G be Gaussian distributions with means $m_F$ and $m_G$, variances
$\sigma_F^2$ and $\sigma_G^2$, and densities dF and dG. We define $Q_1$
and $Q_2$ to be the roots of the quadratic equation in the variable Q:

$$\mathrm{Ln}\; dF(Q) = \mathrm{Ln}\; dG(Q) \qquad (E2.6)$$

Since, for $K \in \{F,G\}$,

$$dK(Q) = (\sigma_K \sqrt{2\pi})^{-1} \exp[-(Q-m_K)^2/2\sigma_K^2]$$

we have this form of E2.6:

$$\mathrm{Ln}\,\sigma_F + (Q-m_F)^2/2\sigma_F^2 = \mathrm{Ln}\,\sigma_G + (Q-m_G)^2/2\sigma_G^2$$

If $\sigma_F = \sigma_G$ then $Q_2 = \infty$ (Figures 2.4 and 2.5).

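Gaussian Measures in P-space over $R^1$

In order to obtain an expression for the metric in P, we
want to form $|F-G|(R^1)$. Looking at the nine cases,

$$(\sigma_F\ <,=,>\ \sigma_G) \times (m_F\ <,=,>\ m_G)$$

we see that $\sigma_F < \sigma_G$ means that dF > dG between $Q_1$ and
$Q_2$ no matter what values $m_F$ and $m_G$ have, and that dF < dG
between $Q_1$ and $Q_2$ whenever $\sigma_F > \sigma_G$. When
$\sigma_F = \sigma_G$, dF > dG up to $Q_1$ if $m_F < m_G$ and dF < dG up
to $Q_1$ if $m_F > m_G$. If in addition $m_F = m_G$ then dF = dG
everywhere.

Consider case #1: $\sigma_F > \sigma_G$

$$d(F,G) = |F-G|(R^1) = \int d|F-G|
  = \int_{-\infty}^{Q_1} (dF-dG) + \int_{Q_1}^{Q_2} (dG-dF) + \int_{Q_2}^{+\infty} (dF-dG)$$

Defining

$$\mathrm{erf}(x) = \int_x^{+\infty} (2\pi)^{-1/2} \exp(-\xi^2/2)\, d\xi$$

(the upper tail of the standard normal distribution), we see that

$$\int_{-\infty}^{y} (2\pi)^{-1/2} \exp(-\xi^2/2)\, d\xi = 1 - \mathrm{erf}(y)$$

so that

$$\int_x^y (2\pi)^{-1/2} \exp(-\xi^2/2)\, d\xi = \mathrm{erf}(x) - \mathrm{erf}(y)$$

and

$$d(F,G) = -2\,\mathrm{erf}[(Q_1-m_F)/\sigma_F] + 2\,\mathrm{erf}[(Q_2-m_F)/\sigma_F]
          + 2\,\mathrm{erf}[(Q_1-m_G)/\sigma_G] - 2\,\mathrm{erf}[(Q_2-m_G)/\sigma_G]$$

and since $\sigma_F < \sigma_G$ changes the signs, case #2 gives the
negative of the above equation so that, for $\sigma_F \ne \sigma_G$:

$$d(F,G) = 2\,\big|\,\mathrm{erf}[(Q_1-m_F)/\sigma_F] - \mathrm{erf}[(Q_1-m_G)/\sigma_G]
          + \mathrm{erf}[(Q_2-m_G)/\sigma_G] - \mathrm{erf}[(Q_2-m_F)/\sigma_F]\,\big|$$

while if $\sigma_F = \sigma_G = \sigma$ then

$$d(F,G) = 2\,\big|\,\mathrm{erf}[(m_F-m_G)/2\sigma] - \mathrm{erf}[(m_G-m_F)/2\sigma]\,\big|$$

These two equations define $d(\cdot,\cdot)$ for any Gaussian measures in $P(R^1)$.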
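
The crossing-point recipe can be sketched numerically as follows. This is a hedged illustration, not the report's own computation: `upper_tail` stands for the text's erf (an upper-tail normal probability, expressed here through the standard library's `math.erfc`), the parameters are standard deviations, and `numeric_distance` is a brute-force midpoint integration added only as a cross-check.

```python
import math

def upper_tail(x):
    """The text's erf(x): integral of the standard normal density from x to +inf."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def crossing_points(mF, sF, mG, sG):
    """Roots Q1 <= Q2 of Ln dF(Q) = Ln dG(Q); Q2 = +inf when sF == sG."""
    if sF == sG:
        return 0.5 * (mF + mG), math.inf
    a = 0.5 / sF**2 - 0.5 / sG**2
    b = mG / sG**2 - mF / sF**2
    c = 0.5 * mF**2 / sF**2 - 0.5 * mG**2 / sG**2 + math.log(sF / sG)
    disc = math.sqrt(b * b - 4.0 * a * c)
    r1, r2 = (-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)
    return min(r1, r2), max(r1, r2)

def p_space_distance(mF, sF, mG, sG):
    """d(F,G) = |F-G|(R^1) from the four upper-tail terms."""
    q1, q2 = crossing_points(mF, sF, mG, sG)
    u = upper_tail
    return 2.0 * abs(u((q1 - mF) / sF) - u((q1 - mG) / sG)
                     + u((q2 - mG) / sG) - u((q2 - mF) / sF))

def numeric_distance(mF, sF, mG, sG, lo=-40.0, hi=40.0, n=200_000):
    """Midpoint-rule value of the integral of |dF - dG| (cross-check only)."""
    def pdf(x, m, s):
        return math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
    h = (hi - lo) / n
    return h * sum(abs(pdf(lo + (i + 0.5) * h, mF, sF) - pdf(lo + (i + 0.5) * h, mG, sG))
                   for i in range(n))

print(p_space_distance(0.0, 1.0, 0.0, 2.0), numeric_distance(0.0, 1.0, 0.0, 2.0))
```

When the standard deviations coincide the quadratic degenerates, which is why the sketch treats that case separately and sends $Q_2$ to infinity.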


Gaussian Measures in P'-space over $R^1$

While the P-space distance is relatively messy, the P' distance
between Gaussian distributions has a particularly simple form.
Letting f and g be the frequency functions of the distributions
F and G we have

$$d(f,g) = \sup_x |f(x)-g(x)|
  = \max\{f(m_F)-g(m_F),\ g(m_G)-f(m_G)\}$$

$$= \begin{cases} f(m_F)-g(m_F) & \text{when } \sigma_F \le \sigma_G \\
                  g(m_G)-f(m_G) & \text{when } \sigma_F \ge \sigma_G \end{cases}$$




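Gaussian Measures in $P_H$-space over $R^1$

Finally, we evaluate the metric derived from embedding the set
P within Hilbert space. Computing $\rho(\cdot,\cdot)$:

$$\rho(F,G) = \int \sqrt{dF/dG}\; dG
  = \frac{1}{\sqrt{2\pi\sigma_F\sigma_G}} \int \exp\!\left[-\left(\frac{x-m_F}{2\sigma_F}\right)^2
    - \left(\frac{x-m_G}{2\sigma_G}\right)^2\right] dx$$

completing squares in the integrand,

$$\rho(F,G) = \sqrt{\frac{2\sigma_F\sigma_G}{\sigma_F^2+\sigma_G^2}}\;
  \exp\!\left[-\frac{(m_F-m_G)^2}{4(\sigma_F^2+\sigma_G^2)}\right]$$

and substituting we obtain:

$$d(F,G) = (2[1-\rho(F,G)])^{1/2}
  = \sqrt{2}\left[1 - \sqrt{\frac{2\sigma_F\sigma_G}{\sigma_F^2+\sigma_G^2}}\;
    \exp\!\left[-\frac{(m_F-m_G)^2}{4(\sigma_F^2+\sigma_G^2)}\right]\right]^{1/2}$$

which shows just how complicated conceptually simple results can
become. As a next step, these formulae should be extended to
multi-dimensional Gaussian distributions, but we will not do it
here.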
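
A numerical cross-check of the Gaussian affinity is easy to set up. The sketch below integrates the square root of the product of the two densities directly and compares it with the closed form that results from completing the square; that closed form is stated here as an assumption of the sketch (it is the standard one-dimensional result), the function names are hypothetical, and `sF`, `sG` denote standard deviations.

```python
import math

def gaussian_affinity(mF, sF, mG, sG):
    """Closed-form rho(F,G) for one-dimensional Gaussians (assumed standard result)."""
    s2 = sF**2 + sG**2
    return math.sqrt(2.0 * sF * sG / s2) * math.exp(-(mF - mG) ** 2 / (4.0 * s2))

def hilbert_distance(mF, sF, mG, sG):
    """d(F,G) = (2[1 - rho(F,G)])^(1/2), clamped at 0 against round-off."""
    return math.sqrt(max(0.0, 2.0 * (1.0 - gaussian_affinity(mF, sF, mG, sG))))

def numeric_affinity(mF, sF, mG, sG, lo=-60.0, hi=60.0, n=200_000):
    """Midpoint-rule value of the integral of sqrt(f(x) g(x)) (cross-check only)."""
    def pdf(x, m, s):
        return math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
    h = (hi - lo) / n
    return h * sum(math.sqrt(pdf(lo + (i + 0.5) * h, mF, sF) * pdf(lo + (i + 0.5) * h, mG, sG))
                   for i in range(n))

print(gaussian_affinity(0.0, 1.0, 1.0, 2.0), numeric_affinity(0.0, 1.0, 1.0, 2.0))
print(hilbert_distance(0.0, 1.0, 1.0, 2.0))
```

Identical distributions give affinity one and distance zero, and the affinity decays toward zero as the means separate, consistent with the properties of $\rho$ listed earlier.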


2.7  Performance  Criteria  and  Linear  Risk 

No sooner are we given a class C of models than we want to
choose one of the class to analyze or perhaps to build. The
easiest way to accomplish this is to define a real valued function
$\Pi$ on C. The set-valued right inverse $\Pi^{-1}$ then images the total
ordering < on $R^1$ into a total ordering << defined on C. Either
the lub or glb of << should exist in C so that this element can
be the one chosen. Since changing $\Pi \to -\Pi$ switches the lub and glb
of the induced ordering <<, we will be interested in showing that
$\Pi$ is bounded either above or below, but not necessarily both.


We will only discuss a single class of performance criteria,
but they are useful for simple detection problems since they
take $S_{2L+1} = S_1$ rather than some combination of the intermediate
spaces, in which case the model would be set up for estimation problems
or mixed detection/estimation problems. We arrive at our class
of criteria by assuming a linear relationship between the various
detection alternatives and the total benefit produced by the model.

Let C(a,b) be the benefit derived when $a \in S_1$ is transmitted and
$b \in S_{2L+1} = S_1$ is detected. The expected benefit is just

$$\Pi_C(\mu) = \int C(a,b)\, \mu^a_{2L+1}(db)\, \mu_1(da)$$

where the integration is over $S_1 \times S_{2L+1}$, and the integral exists
whenever C(a,b) is measurable. When C(a,b) is bounded, $\Pi_C$ is too.

This Bayesian performance criterion can also be written in an
expanded form as

$$\Pi_C(\mu) = \int C(a,b)\, \chi_a(t^{-1}db)\, \mu(da\,dt)$$

$\Pi_C$ should be insensitive to changes in $\mu$ since $\mu$ can never be
known exactly. At the least this should mean continuity of $\Pi_C$ with
respect to some topology on the space of which $\mu$ is a member. One
of the merits of a linear risk (or benefit) performance criterion
is precisely this continuity in both of the spaces P and P':

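THEOREM 8  If |C(a,b)| is bounded by c then $\Pi_C$ is continuous in P.

PROOF

$$|\Pi_C(\mu)-\Pi_C(\nu)| = \Big|\int C(a,b)\, \chi_a(t^{-1}db)\, (\mu-\nu)(da\,dt)\Big|$$

$$\le c \int \chi_a(t^{-1}db)\, |\mu-\nu|(da\,dt)$$

$$= c \int |\mu-\nu|(da\,dt)$$

$$= c\, d_P(\mu,\nu)$$

where $d_P(\cdot,\cdot)$ is the metric in P.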
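
The bound in the proof can be seen on a toy case. The sketch below makes a simplifying assumption that is not in the text: the operator variable is integrated out, so each model is summarized by a finite joint table over (transmitted a, detected b). The names `expected_benefit` and `variation_distance` are hypothetical.

```python
def expected_benefit(C, joint):
    """Sum over (a, b) of C[a][b] * joint[a][b]: a finite stand-in for Pi_C."""
    return sum(C[a][b] * joint[a][b]
               for a in range(len(C)) for b in range(len(C[a])))

def variation_distance(mu, nu):
    """Sum over (a, b) of |mu - nu|: a finite stand-in for the P-space metric."""
    return sum(abs(mu[a][b] - nu[a][b])
               for a in range(len(mu)) for b in range(len(mu[a])))

# Benefit 1 for a correct detection, 0 otherwise, so |C| is bounded by c = 1.
C = [[1.0, 0.0], [0.0, 1.0]]
c = 1.0

# Two slightly different joint tables (rows: a, columns: b); illustrative numbers only.
mu = [[0.45, 0.05], [0.10, 0.40]]
nu = [[0.40, 0.10], [0.10, 0.40]]

gap = abs(expected_benefit(C, mu) - expected_benefit(C, nu))
bound = c * variation_distance(mu, nu)
print(gap, bound, gap <= bound)
```

Shrinking the perturbation between the two tables shrinks the bound, and with it the change in expected benefit, which is the continuity statement of the theorem in miniature.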




COROLLARY  1 


If  is  countable  then  ^L+l^  continuous  in  P, 
COROLLARY 2  If C(a,b) is bounded then $\Pi_C$ is continuous in P'.

PROOF  Let $p'_n \to p'$ in P'. We wish to show that
$\Pi_C(p_n) \to \Pi_C(p)$ where $p_n = \int p'_n$ is in P. But since
$p'_n \to p'$ in P' implies that $p_n \to p$ in P (Lemma 1,
section 2.5), this follows immediately.


2.8  Expected  Error  and  Singularity 

An important special case of linear benefit arises when
$C(a,b) = d_1(a,b)$ where $d_1(\cdot,\cdot)$ is the metric in the spaces
$S_1$ and $S_{2L+1}$. The resulting performance criterion, $\Pi_{d_1}$,
or simply $\Pi_d$, is the expected error. If $\Pi_d(\mu)$ is zero, the model

$$M = (S_1, S_2, \ldots, S_{2L+1}, \mu)$$

is said to be singular. If $\mu_1(a) > 0$ then a is called a naturally
occurring source character; it is clearly no restriction to
consider only naturally occurring source characters. The following
theorem and corollary are basic to an understanding of model
singularity.


THEOREM 9  If $S_1 = S_{2L+1}$ are countable with a discrete metric then
M is singular iff $\mu^r_{2L+1}\{r\} = 1$ for every naturally occurring
$r \in S_1$.

PROOF  To establish necessity, assume that M is singular. Then

$$0 = \Pi_d(\mu) = \int d(a,b)\, \mu^a_{2L+1}(db)\, \mu_1(da)
    = \sum_{i,j} d(a_i,b_j)\, \mu^{a_i}_{2L+1}(b_j)\, \mu_1(a_i)$$

and since each term in the sum is non-negative, each term
must be zero separately:

$$d(a_i,b_j)\, \mu^{a_i}_{2L+1}(b_j)\, \mu_1(a_i) = 0 \quad \text{for all } i,j$$

from which the necessity follows, and so does the
sufficiency.
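
For a finite source alphabet the theorem can be illustrated directly. The sketch below is an assumption-laden toy: the whole model is collapsed into a prior over source characters and a table of conditional detection probabilities, and the names `expected_error` and `is_singular` are introduced here for illustration.

```python
def expected_error(prior, cond):
    """Pi_d with the discrete metric: d(a, b) = 0 if a == b, else 1."""
    n = len(prior)
    return sum(prior[a] * cond[a][b]
               for a in range(n) for b in range(n) if a != b)

def is_singular(prior, cond, tol=1e-12):
    """Theorem 9 criterion: cond[r][r] == 1 for every r with prior[r] > 0."""
    return all(abs(cond[r][r] - 1.0) <= tol
               for r in range(len(prior)) if prior[r] > 0.0)

prior = [0.5, 0.5]
noiseless = [[1.0, 0.0], [0.0, 1.0]]   # every character detected correctly
noisy = [[0.9, 0.1], [0.2, 0.8]]       # illustrative error rates

print(expected_error(prior, noiseless), is_singular(prior, noiseless))
print(expected_error(prior, noisy), is_singular(prior, noisy))
```

Characters with zero prior mass are skipped in `is_singular`, which mirrors the restriction to naturally occurring source characters.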


COROLLARY 1  If $S_1 = S_{2L+1}$ are countable with a discrete metric
then M is singular iff $\mu^r_{2L+1} \perp \mu^s_{2L+1}$ for every
naturally occurring pair $r \ne s$ in $S_1$.


This corollary expresses the well known, but usually imprecisely
stated, fact that model singularity is equivalent to measure
orthogonality. An obvious extension of this theorem is provided by:


THEOREM 10  If M factors then $\mu^r_{2L+1} \perp \mu^s_{2L+1}$ for a
naturally occurring pair $r \ne s$ iff $\mu^r_{2k+1} \perp \mu^s_{2k+1}$
for all $k \le L$.

PROOF  We  will  establish  necessity  for  k=L-l  by  showing  the 

IT  *•  A 

contrapositive.  Letting  Qk=supp(u2k+^)  »  this  means 
assuming  either  ^k+l^k^0  or  ^k+l^k^0  and  showing 
that  the  same  holds  for  either  F^L+l  or  W2L+1’  respectively. 
First  note  that 

v2L+l(R)e/*xr(-rlR>»J2L<di) 

SL 


0  This  and  subsequent  proofs  have  been  simplified  by  suppressing 
references  to  "some"  support  and  talking  instead  about  a  (reads  like 
the)  support  of  a  measure.  No  error  has  been  introduced  by  doing 
this,  while  the  arguments  have  become  easier  to  follow. 




LEMMA  1 


PROOF 


or,  since  M  factors  and  2L-l*2k+l 

u2L+l^R't“  i/-  xa^C2LR^2L^dt2L^2L-l^da^ 

QkxS2L 

and  a  similar  expression  holds  for  p|L+^(R). 

Now  assume  that  l^k+l^k^^  and  ^°°k  at: 

U2L+1*Q1^“  r£-  xa(t2LQl)‘J2L(dt2L)u2L-l(da^ 
QkxS2L 

“  /  V2k+l(t2LQL)lJ2Lfdt2L) 

S2L 

"2k+lWk  "  t2K)*,2LWt2L) 

S2L 


which, by the lemma below, is greater than zero. This
establishes necessity and hence the theorem, since the
sufficiency is immediate from the definitions.


If  w2k-U<Qk)>0  then 


n  '2K<v 

S2L 


1“  V  2L+1 ( QL5  “ /  V  2k+l  ( 1  2lV  y  2L  (  d  C 2L5 
b2L 

“s/W2k+l(Qk  n  t2LQL)v,2L(dt2L) 


2L 


hence  t^Q^  =  Q£  for  all  t2Lesupp(u2L> .  But,  stronger 
than  that,  this  says  that  Qa  is  covered  by  the  inverse 
maps  of  Q®,  or,  precisely. 


C2LMupPtBa) 


'it"’!*  s  Qk 


BuC  chat  means  that  if  N^k+l^k^®  then  likewise 


°</"L+l<'>k  "  t^<)”2l(dt2L> 
S2L 




This  theorem  says  that  the  conditional  measures  must  start 
out  orthogonal  and  stay  orthogonal  if  the  model  is  singular, 
and  vice-versa.  But,  this  holds  only  for  factorable  models,  where 
the marginal measures are all independent. A non-factorable model
can be rigged, as in the example of the next section, which is
still  singular  even  though  some  of  the  intermediate  conditional 
measures  are  not.  The  converse  counter-example  which  corresponds 
to  this  example  is  easily  constructed,  so  it  will  not  be  explicitly 
given. 


2.9  Singularity in a Non-factorable Model

Our example will illustrate the necessity of the factorability
assumption in Theorem 10, section 2.8. Take as the model

$$M = (S_1, S_2, S_3, S_4, S_5, \alpha, \nu)$$

where $\alpha$ is defined on $S_1$ and $\nu$ is defined on $S_2 \times S_4$, and

$$S_1 = S_3 = S_5 = \{0,1\}$$

$$S_2 = S_4 = \{0,1\} \times \{0,1\}$$

$\alpha$ is given by:

    x ∈ S_1     α(x)
    1           1/2
    0           1/2

$S_2$ and $S_4$ are both spaces of operators, and in order to give
$\nu$ on $S_2 \times S_4$, we will represent pairs of operators
$(b,d) \in S_2 \times S_4$ by enumeration of the operator values on all
of the points in $S_1 \times S_3$.

To define one pair (b,d) we only need to give four values, since $S_1$,
the domain of $b \in S_2$, and $S_3$, the domain of $d \in S_4$, each
consist of only



This  shows  clearly  that  in  fact,  they  are  equal.  However, 


clearly  showing  that 

This,  then,  is  a  non-factorable  2  stage  model.  The  marginal 
measures  after  the  first  stage  are  equivalent,  while  the  marginal 
measures  after  both  stages  are  orthogonal,  making  the  model  singular. 
This  happens  because  the  two  stages  are  not  independent;  the  model 
does  not  factor.  As  a  result,  the  second  stage  can  be  arranged 
(and  is  in  this  example)  so  as  to  undo  the  random  selection 
introduced  by  the  first  stage.  Fortunately,  this  kind  of  dependence, 
correlation of 1, is not even a plausible representation of reality,
and  so  would  never  be  used  in  an  actual  C/D  model. 




3.  LINEAR TRANSMISSION AND ADDITIVE NOISE


In  chapter  2  the  outlines  of  a  general  theory  of  communication 
and  detection  models  were  established.  In  this  chapter  that  theory 
will  be  specialized  in  order  to  obtain  certain  results  for  models 
of  additive  noise  and  linear  transmission.  The  assumption  of 
additive  noise  is  widely  made  because  of  the  mathematical  tractability 
which it provides and, once additive noise is assumed, the
assumption  of  linear  transmission  often  follows. 

A  special  "multiplicative"  stage  is  also  defined  and  analyzed. 
Although  this  type  of  stage  is  rather  simple,  it  is  included  here 
for  reference  in  chapter  4,  where  these  three  kinds  of  stages, 
multiplicative,  additive,  and  linear,  will  be  used  in  the  synthesis 
of  sonar  models.  The  theorems  in  this  chapter  will  then  make  it 
possible  to  expose  and  understand  a  certain  kind  of  (additive) 
singularity  that  can  arise  in  these  models. 

3.1  Additive  Stages 

When the effect of a stage of a C/D model is to add two linear
subspaces together, as in stage 2 of the examples in section 2.4,
a very important kind of singularity of the model can occur.

Following Figure 3.1, suppose S and N are subspaces of a linear
space, $S \cap N$ the subspace they have in common and S+N their sum.

If $\mu_S$ is supported by S-{0} (i.e., by S exclusive of the origin),
$\mu_0\{0\} = 1$ and $\mathrm{supp}(\mu_N) \subset N$, we can consider
two measures, $\mu_{S+N}$ and $\mu_{0+N}$, on the sum space S+N which are
induced by addition of the two spaces. We see that $\mu_{0+N} = \mu_N$
and that $\mu_{S+N}$ is supported by that portion of S+N which is above
N if $S \cap N = 0$. The result is that $\mu_{S+N} \perp \mu_N$, and the
reason is that $\mathrm{supp}(\mu_{S+N})$ has a linear projection
outside of the subspace spanned by $\mathrm{supp}(\mu_N)$. This last
phrase will turn up again in a stronger form when we consider Gaussian
processes a little later in this section.

Figure 3.1  Possible Singularity in Additive Stages




We say that the k-th stage, $(S_{2k-1}, S_{2k}, S_{2k+1})$, of a C/D
model is additive if:

1.  $S_{2k+1}$ is a Banach space.

2.  $S_{2k-1}$ is a subspace of $S_{2k+1}$.

3.  $S_{2k} = \{b_h \mid h \in Q\}$ where Q is a subspace of $S_{2k+1}$ and
    $b_h : S_{2k-1} \to S_{2k-1} + h \subset S_{2k+1}$.


For simplicity of nomenclature, we will use $S_{2k}$ to mean both
the space of operators and the subspace Q of $S_{2k+1}$, since the map
$\Psi(b_h) = h$ is an isomorphism. Giving $S_{2k}$ the norm
$\|b_h\| = \|h\|$ makes $\Psi$ an isometry as well.

As defined here, additive stages have an important property:
when $S_{2k} \cap S_{2k-1} = 0$, that is, when they have only the origin
in common, singularity is preserved by the k-th stage. In an abuse
of language that is unlikely to cause confusion, a stage which
preserves singularity is itself said to be singular. If we let
$M_k$ be the first k stages of a C/D model M, we can state this as
a theorem:


THEOREM 1  If $M_{k-1}$ is singular and stage k is additive with
$S_{2k} \cap S_{2k-1} = 0$, then $M_k$ is singular, i.e., stage k is
singular.

PROOF  Singularity of $M_{k-1}$ implies the existence of disjoint
supports for the conditional measures:
$Q^s_{k-1} \cap Q^r_{k-1} = \{\}$ for all $s \ne r$. Since we also have
$S_{2k} \cap S_{2k-1} = 0$, we have

$$\bigcup_{b \in S_{2k}} b(Q^s_{k-1}) \;\cap \bigcup_{b \in S_{2k}} b(Q^r_{k-1}) = \{\}
  \quad \text{for all } s \ne r$$

But,

$$\bigcup_{b \in S_{2k}} b(Q^s_{k-1}) = Q^s_k$$

so

$$Q^s_k \cap Q^r_k = \{\} \quad \text{for all } s \ne r$$

and $M_k$ is singular. (Remember that $Q^s_k$ is defined to be
the support of $\mu^s_{2k+1}$; see Theorem 2.10.)


This theorem is useful whenever additive noise is used in a
model, as it is in the examples of section 2.4 and as it is in the
sonar models of sections 4.4 and 4.5. Having these sufficient
conditions for an additive stage to be singular whets the appetite
and motivates a search for necessary conditions. As a step in
that direction:


THEOREM 2  If $M_{k-1}$ is singular, stage k is such that
$\mathrm{supp}(\mu_{2k}) \supset S_{2k-1}$ and $\mu_{2k}$ is independent
of $\mu_i$ for all i<2k, then $M_k$ is non-singular.

PROOF

y2k+l(Qk>V  '  XaCt2iQk)y2k(dt2k)u2k-l(da) 

VlXS2k 

and  by  the  assumed  property  of  stage  k, 
supp(p  )  n  supp(u  .)f{) 

4K 

s  r 

so  that  t2kQk  n  Qkl^{ )  for  all  t  in  some  set  T  of 


positive  p,,  measure  in  S 

‘k  2k 

is  non-singular . 


Hence  V2k+l^k^>0  and  \ 


Notice that this proof really says nothing about the singularity
of $M_{k-1}$, so the theorem holds even if $M_{k-1}$ is non-singular.




Gaussian  Measures 


When Gaussian measures are assumed in anything, stronger
results can usually be expected. In the Gaussian detection models,
in fact, M must either be singular or the conditional measures must
be equivalent:

THEOREM 3  If $\mu$ and $\nu$ are Gaussian, either $\mu \perp \nu$ or $\mu \sim \nu$.

A  proof  in  the  general  case  has  been  given  by  J.  Feldman  [1]. 

For  further  discussion  of  related  work,  see  section  1.3. 

Various ways of telling whether $\mu \perp \nu$ or $\mu \sim \nu$ have
been discovered. They fall, generally, into two groups: 1) general,
universally applicable and non-constructive, hence useless in
practical analysis (Feldman's original proofs are of this nature);
2) constructive, but applicable only to special kinds of normal
distributions, such as Markov processes, or stationary processes at
least one of which has a rational spectral density. What is perhaps
the most useful of the universal results was obtained by Kallianpur
and Oodaira [1]:

THEOREM 4  $P \sim Q$ iff

1/  $m(\cdot) \in H(\Gamma_Q)$

2/  $\Gamma_P$ has a representation
    $\Gamma_P(s,t) = \sum_k \mu_k\, e_k(s)\, e_k(t)$
    where $\{e_k\}$ is a c.o.n. system in $H(\Gamma_Q)$ and
    $\sum_k (1-\mu_k)^2 < \infty$ and $\mu_k \ge c > 0$ for all k



In this theorem, $H(\Gamma)$ is the reproducing kernel Hilbert space
(Aronszajn [1][2]) with kernel $\Gamma$, and P, Q are the distributions for
the normal processes with correlations $\Gamma_P$, $\Gamma_Q$ and mean
functions $m(\cdot)$ and 0 respectively.

As an immediate application to additive stages, we have:


COROLLARY 1  If $P \perp Q$ because $m(\cdot) \notin H(\Gamma_Q)$ then $P+Q \perp Q$


This can be applied to example number 1, section 2.4, to
show that the second stage, which is additive, is singular if
the conditional measures entering it are orthogonal because the
mean is not in the corresponding space $H(\Gamma)$. As an application
of the second condition of Theorem 4 we have


COROLLARY 2  If $P \perp Q$ because $\mu_k \to 0$ or some $\mu_k = 0$ then $P \perp P+Q$

This is the situation that occurs when the signal, Q, occupies
some dimensions ("bandwidth") that the noise does not. While this
can not happen in a practical sense, it can plague model builders.
More about this in Chapter 4. Finally, as a converse to Corollary 2:

COROLLARY 3  If $P \perp Q$ because $\mu_k \to 0$ or some $\mu_k = 0$ then $P+Q \sim Q$

This is the normal situation: the noise, Q, is wider-band
than the signal, P, so the additive noise stage is non-singular.

Thinking in terms of linear spaces, Theorem 4 says to first
look at the linear space most closely tied to the process (with
distribution) Q. This is just $H(\Gamma_Q)$, the RKHS (Reproducing Kernel
Hilbert Space) with kernel $\Gamma_Q$. $H(\Gamma_Q)$ is a natural space for Q
because A) it supports Q and B) distances reflect the concentration
of Q along various axes. Now, given another Gaussian process P
with mean $m(\cdot)$, in order to see if $P \perp Q$ we have to do several
things. The first is to check that $m(\cdot) \in H(\Gamma_Q)$. If it is not,
then P is lifted up out of $H(\Gamma_Q)$, that is, has a linear projection
outside of the support of Q, so $P \perp Q$. If $m(\cdot) \in H(\Gamma_Q)$ then
we still have to check further. To begin with, $\Gamma_P$ may not be
representable in $H(\Gamma_Q) \times H(\Gamma_Q)$, that is, $\Gamma_P$ may have a
linear projection outside of $H(\Gamma_Q)$. The support of P will then
too, so that $P \perp Q$. Secondly, some $\mu_k$ may be zero. That is,
$H(\Gamma_Q)$ may have linear dimensions not needed to represent
$\Gamma_P$, in which case supp(Q) will have linear dimensions outside of
supp(P), so that again $P \perp Q$. Finally, we are asked to look at the
distribution of "energy" into the different "eigenfrequencies" of the
two processes. Unless the two put almost the same energy on all but
a finite number of dimensions, i.e., unless
$\sum_k (1-\mu_k)^2 < \infty$, then $P \perp Q$. While this last
requirement has the only probabilistic flavor of all the requirements
for $P \sim Q$, even it is closely related to the concept of projections
outside of a given linear space. What it says is that supp(P) and
supp(Q) may not sneak linear projections outside of each other through
divergent behavior at infinity if $P \sim Q$ is to hold.
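
The role of the condition $\sum_k (1-\mu_k)^2 < \infty$ can be illustrated in a diagonal special case. The sketch below is not the Kallianpur and Oodaira theorem itself: it assumes zero means, assumes Q has unit variance and P has variance $\mu_k$ along each coordinate $e_k$, and relies on Kakutani's product criterion, under which the two product measures are equivalent when the product of coordinate affinities stays positive and orthogonal when it tends to zero. Function names and the truncation lengths are hypothetical.

```python
import math

def coordinate_affinity(mu_k):
    """Affinity between N(0, 1) and N(0, mu_k) on one coordinate (mu_k > 0)."""
    return math.sqrt(2.0 * math.sqrt(mu_k) / (1.0 + mu_k))

def product_affinity(mus):
    """Product of coordinate affinities over a finite truncation, via logarithms."""
    return math.exp(sum(math.log(coordinate_affinity(m)) for m in mus))

# Energies that agree asymptotically: sum (1 - mu_k)^2 = sum 1/k^2 converges.
summable = [1.0 + 1.0 / k for k in range(1, 10_001)]

# Energies that differ on every coordinate: sum (1 - mu_k)^2 diverges.
divergent = [2.0] * 1000

print(product_affinity(summable))    # stays well away from zero
print(product_affinity(divergent))   # collapses toward zero
```

A finite truncation can only suggest the limiting behavior, so the output should be read as an illustration of the trend rather than a test of equivalence.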




3.2  Linear  Stages 


When $S_{2k-1}$ and $S_{2k+1}$ are Banach spaces and all of the elements
in $S_{2k}$ are linear operators (additive and continuous) from
$S_{2k-1}$ into $S_{2k+1}$, then stage k is linear. As one expects,
linear stages are also capable of preserving singularity, or of being
singular to use the verbal shorthand introduced in section 3.1. The
simplest case arises when $\mu_{2k}$ is degenerate at t, in which case the
effect of the stage depends entirely upon the nullspace of t, call it $T_0$.
Letting  be  the  subspace  spanned  by  q£,  we  have: 

THEOREM  5  If  is  a  Hilbert  space,  is  singular  and 

-L  T0  for  all  hut  one  rcS^  then  is  singular. 

PROOF  t(Q  and  tCQ^  n  tCQ^)  *■  0  since  t  is  1:1 

°n  S2k-1~T0 ‘ 

Extension of this result to Banach spaces requires the assumption
that projections $P_0$ onto $T_0$ and $P_1 = I - P_0$ exist, since the
existence of a projection onto an arbitrary subspace of a Banach
space is not guaranteed. If $P_0$ does exist, we have:

THEOREM  6  If  is  singular  and  P^Q^  ^  for  ahl  hut  one 

then  is  singular. 

Either  of  these  theorems  has  an  obvious  extension  to  multiple 
operators  via  the  simple  expedient  of  defining: 

$T_0$ = {union of all nullspaces of operators in $\mathrm{supp}(\mu_{2k})$}


Under this definition, $T_0$ may not be unique, so the extended
theorem must be phrased as: $M_k$ is singular if $M_{k-1}$ is singular
and there exists at least one $T_0$ such that the hypothesis of the
theorem holds for all but one $s \in S_1$.


3.3  Multiplicative  Stages 

The  k-th  stage  of  a  model  is  said  to  be  source  multiplicative 
if: 


1.  $S_{2k+1}$ is a Banach space

2.  $S_{2k}$ is isomorphic to $S_{2k+1}$ in the usual way (see section
    3.1), and $b_h \in S_{2k}$ maps $a \in S_{2k-1}$ into $a \cdot h \in S_{2k+1}$.

3.  $S_{2k-1}$ is the field from which $S_{2k+1}$ is generated,
    either $C^1$ or $R^1$.


The  following  almost  trivial  theorem  on  singularity  of  a 
source  multiplicative  stage,  finds  application  to  the  first,  or 
encoding  stages  of  models  in  sections  4.4  and  4.5. 


THEOREM  7 


PROOF 


If  is  source  multiplicative,  p^  has  no  atomic  part, 
S^K{0,1},  p^  J_  p^  and  p^  is  completely  degenerate  at  0, 
then  is  singular. 

All  of  the  mass  of  p^  is  collapsed  onto  zero  when 
multiplying  by  zero,  so  that  p^  is  degenerate  at  zero 
with  mass  1  there,  while  does  not  have  an  atomic  part 
at  the  origin. 




Just to be different we could let $S_{2k}$ rather than $S_{2k-1}$ be
the field from which $S_{2k+1}$ is generated. We say that the k-th
stage is simple operator multiplicative if:

1.  $S_{2k+1}$ is a Banach space

2.  $S_{2k-1}$ is $S_{2k+1}$

3.  $S_{2k}$ is isomorphic to the field of $S_{2k+1}$, and if
    $a \in S_{2k-1}$, $b \in S_{2k}$ then $b(a) = b \cdot a \in S_{2k+1}$

This kind of stage models a simple fading: the amplitude of
the signal is a random variable. Another kind of "multiplication",
in which the space of operators has a group structure, is a delay
stage:

1.  $S_{2k+1}$ is a space of functions. Each function has the
    real line as at least one argument, say the first.

2.  $S_{2k-1}$ is $S_{2k+1}$

3.  $S_{2k}$ is a group of translation operators: if $a \in S_{2k-1}$
    and $T_h \in S_{2k}$ then $T_h\, a(t_1, \ldots) = a(t_1+h, \ldots)$

These two last types can be combined to give a gross model
of multi-path transmission effects. If $a \in S_{2k-1}$ and
$T \in S_{2k}$ then

$$T_{(b,h)}(a) = \sum_i b_i\, a(t+h_i)$$
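
The fading, delay and multi-path operators can be mimicked with ordinary functions of time. The sketch below is only illustrative: signals are plain Python callables of one real argument, the operator parameters are fixed numbers rather than random variables, and the names `fade`, `delay` and `multipath` are introduced here.

```python
import math

def fade(b):
    """Simple operator multiplication: a -> b * a."""
    return lambda a: (lambda t: b * a(t))

def delay(h):
    """Translation operator T_h: (T_h a)(t) = a(t + h)."""
    return lambda a: (lambda t: a(t + h))

def multipath(bs, hs):
    """Combined operator: (T a)(t) = sum_i b_i * a(t + h_i)."""
    def T(a):
        return lambda t: sum(b * a(t + h) for b, h in zip(bs, hs))
    return T

signal = math.sin

# Direct path plus a half-amplitude echo shifted by -2 time units.
received = multipath([1.0, 0.5], [0.0, -2.0])(signal)
print(received(1.0))

# Translations compose additively, reflecting the group structure.
twice = delay(1.0)(delay(2.0)(signal))
print(twice(0.25), signal(3.25))
```

In a C/D model the gains and shifts would be drawn from a measure on the operator space; fixing them here just makes the action of a single operator visible.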

A natural question to ask is, what kind of topology can $S_{2k}$ have
and still induce a measure on $S_{2k+1}$? Simple operator
multiplication, being a subspace of the space of linear operators,
presents no problem. What about the delay stage? It, too, is a
linear operation, but now the "natural" norm, $\|T_h\| = |h|$, does not
always induce a suitable topology: the functions in $S_{2k-1}$ must
be continuous first.




4.  APPLICATIONS TO SONAR


Models of sonar systems must include descriptions of the process
of sound transmission, either explicitly or implicitly. Often these
descriptions begin by assuming not only that the scalar wave equation
is applicable, but that a particular solution of the wave
equation can be used (for example, incident plane waves, Bryn [1]).
While this approach avoids a great deal of complexity, the assumptions
involved may create serious problems (e.g., singular models)
whose origins have been obscured by the lack of explicit detail in
the initial model building.

One of the concerns of this chapter is to identify and understand
all of the assumptions that are made in the sonar models to
be used here, so the chapter starts with a discussion of the basic
physics of sound transmission (this follows Sokolnikoff [1] quite
closely). Derivation of the scalar wave equation requires several
major assumptions which are identified for discussion in section 4.6.

Sections 4.2 and 4.3 are primarily concerned with the kernel
of the transmission operator defined implicitly by solution of
the inhomogeneous wave equation in 2 and 4 variables, respectively.
The kernel of the 2 variable operator can be characterized nicely,
chiefly because the solution of partial differential equations
in two variables can be reduced to the solution of ordinary
differential equations along characteristics. Consideration of the
4 variable transmission operator is much more difficult; it has
only been possible to make a start here.




Sections 4.4 and 4.5 are devoted to sonar problems; the theory
previously developed is applied to the construction and analysis of
sonar models in 1 and 3 spatial dimensions. Finally, section 4.6
contains a summary of the work in the area of singularity of sonar
models and suggestions on the construction of non-singular models.

Throughout section 4.1, especially, use is made of the Einstein
summation convention: whenever a subscript occurs on the right side
of an equation that does not appear on the left, a summation over
all values of that index is implied. The range of the indices is
from 1 to the dimension of the space in which one is working,
normally 2 to 4.

4.1  The  Physics  of  Sound  Transmission 

Underwater sound transmission is treated here as a special
case of wave propagation in a continuous medium. As mentioned
above, we want to derive the differential equation which governs
sound transmission in order to point out all of the physical
assumptions that are implicit in the use of the scalar wave equation to
model sound propagation. Since we want to uncover assumptions,
we are forced to begin at a general level:

Strains 

Consider two states of the same material body:

                                Initial State    Deformed State
    Spatial Region              $T_0$            $T$
    Reference Frame             $Y$              $X$
    Coordinates of a point P    ${}^iy$          $x^i$

As our first assumption (a physically reasonable one, though),
let the deformation of $T_0$ into T be of class $C^1$ and 1:1 so that
the point transformation

$$x^i = x^i({}^1y, {}^2y, {}^3y, t)$$

has an inverse

$${}^iy = {}^iy(x^1, x^2, x^3, t)$$

for all values of the deformation parameter t, and the derivatives

$$\partial x^i/\partial\,{}^jy, \qquad \partial\,{}^iy/\partial x^j$$

exist and are continuous.

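If $ds_0^2$ and $ds^2$ are the squares of the initial and deformed
lengths of an infinitesimal arc, their difference represents the strain
produced in the medium by the deformation. Restricting ourselves to
rectangular Cartesian coordinates alone, we can write:

    $ds_0^2 = dy^i\,dy^i$
    $ds^2 = dx^i\,dx^i$

so that

    $ds^2 - ds_0^2 = (\partial x^k/\partial y^i \cdot \partial x^k/\partial y^j - \delta_{ij})\,dy^i\,dy^j$
                  $= (\delta_{ij} - \partial y^k/\partial x^i \cdot \partial y^k/\partial x^j)\,dx^i\,dx^j$

or,

    $ds^2 - ds_0^2 = 2\eta_{ij}(y,t)\,dy^i\,dy^j$
                  $= 2\epsilon_{ij}(x,t)\,dx^i\,dx^j$

where $\eta_{ij}$ is the Lagrangian strain tensor and $\epsilon_{ij}$ is
the Eulerian strain tensor.

Letting $\xi = (\xi_1, \xi_2, \xi_3)$ be the displacement vector, we can
write:

    $\xi_i(y,t) = x^i(y,t) - y^i$
    $\xi_i(x,t) = x^i - y^i(x,t)$

Differentiating and substituting for $\partial x^i/\partial y^j$ and
$\partial y^i/\partial x^j$, we see that

    $2\eta_{ij}(y,t) = \partial\xi_i/\partial y^j + \partial\xi_j/\partial y^i + \partial\xi_k/\partial y^i \cdot \partial\xi_k/\partial y^j$
    $2\epsilon_{ij}(x,t) = \partial\xi_i/\partial x^j + \partial\xi_j/\partial x^i - \partial\xi_k/\partial x^i \cdot \partial\xi_k/\partial x^j$

By assuming infinitesimal strains, we can drop the product terms
and also disregard the differences between the initial and deformed
coordinates since

    $\partial x^i/\partial y^j = \delta_{ij} + \partial\xi_i/\partial y^j \approx \delta_{ij}$

Having done this, we find

    $\eta_{ij} = \epsilon_{ij} = (1/2)(\partial\xi_i/\partial x^j + \partial\xi_j/\partial x^i)$
              $= (\xi_{i,j} + \xi_{j,i})/2$                          (E4.1)

In this linear theory, $\epsilon_{ij}$ is symmetric.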

This  set  of  assumptions  is  unreasonable  in  general,  but  in 
passive  sonar  problems  the  particle  displacements  are  so  very  small 
as  to  justify  their  adoption. 

Stress 

Stress,  the  force  per  unit  of  area,  is  also  characterized  by 
a symmetric tensor, the stress tensor $\tau^{ij}$. There are no assumptions
hidden  under  the  tensor  cover. 

Equation of Equilibrium

Consider a body $T$ which is in equilibrium under surface forces
given by the stress tensor $\tau^{ij}$ and volume forces given by the
force per unit of volume $F^i$. Letting $\lambda_i$ be a fixed unit
vector and $n_j$ be the unit surface normal, the assumed equilibrium is
expressed by:

    $\int_V F^i \lambda_i\,dv + \int_S \tau^{ij} \lambda_i n_j\,d\sigma = 0$

for every subregion V of T bounded by the surface S. Applying the
divergence theorem and noting that $\lambda_i$ is a constant, we see
that, at every point in T,

    $F^i + \tau^{ij}{}_{,j} = 0$                                     (E4.2)


Equations  of  Motion 

Using D'Alembert's principle, we add the inertial force $-\rho a^i$
to obtain the equations of motion:

    $F^i + \tau^{ij}{}_{,j} - \rho a^i = 0$                          (E4.3)

where $\rho$ is the density and $a^i$ is the acceleration.


Stress-Strain  Relationships 

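An additional basic assumption is that stress and strain are
linearly related:

    $\tau^{ij} = c^{ij}_{km}\,\epsilon^{km}$

Of the 81 components of the tensor $c^{ij}_{km}$, 27 are eliminated by
the symmetry of $\tau^{ij}$, and 18 more by the symmetry (in the linear
theory) of $\epsilon^{km}$. Since the strain energy per unit volume is

    $\delta W = \tau^{ij}\,\delta\epsilon_{ij} = c^{ij}_{km}\,\epsilon^{km}\,\delta\epsilon_{ij}$

    $W = c^{ij}_{km}\,\epsilon^{km}\epsilon_{ij}/2 = c^{km}_{ij}\,\epsilon_{km}\epsilon^{ij}/2$

we see that $c^{ij}_{km} = c^{km}_{ij}$ eliminates 15 more components.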
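
The component count above (81, then 54, 36 and finally 21 independent
components) can be checked by brute force. The following sketch, with
hypothetical helper names, counts classes of index tuples (i, j, k, m)
that the stated symmetries identify with one another:

```python
from itertools import product


def count_independent(symmetries):
    """Count classes of index tuples (i, j, k, m) linked by the symmetries."""
    seen = set()
    count = 0
    for idx in product(range(3), repeat=4):
        if idx in seen:
            continue
        count += 1
        # Flood-fill every index tuple reachable through the symmetries.
        stack = [idx]
        while stack:
            cur = stack.pop()
            if cur in seen:
                continue
            seen.add(cur)
            stack.extend(s(cur) for s in symmetries)
    return count


swap_ij = lambda t: (t[1], t[0], t[2], t[3])      # symmetry of the stress
swap_km = lambda t: (t[0], t[1], t[3], t[2])      # symmetry of the strain
swap_pairs = lambda t: (t[2], t[3], t[0], t[1])   # strain-energy symmetry

counts = [
    count_independent([]),
    count_independent([swap_ij]),
    count_independent([swap_ij, swap_km]),
    count_independent([swap_ij, swap_km, swap_pairs]),
]
print(counts)
```

The successive differences 27, 18 and 15 are the numbers of components
eliminated at each step.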


Isotropic  Medium 

If the medium is isotropic then more components can be elim-
inated:

a)  3 interchanges of axes: 18 components eliminated.

b)  Together with reversal of the sense of an axis, a) shows
    that all but three independent coefficients are zero.

c)  Invariance under rotation shows, finally, that

        $c^{12}_{12} = c^{11}_{11} - c^{11}_{22}$

The remaining components can be taken in the form:

    $c^{12}_{12} = c^{13}_{13} = c^{23}_{23} = c^{21}_{21} = c^{31}_{31} = c^{32}_{32} = 2\mu$

    $c^{11}_{22} = c^{11}_{33} = c^{22}_{11} = c^{22}_{33} = c^{33}_{11} = c^{33}_{22} = \lambda$

    $c^{11}_{11} = c^{22}_{22} = c^{33}_{33} = \lambda + 2\mu$

from which we obtain the isotropic relationship:

    $\tau^i_j = \lambda\,\epsilon^k_k\,\delta^i_j + 2\mu\,\epsilon^i_j$      (E4.4)

The assumption of isotropy is a good one in sonar work.


Isotropic  Equations  of  Motion 

Putting E4.4 into E4.3, and using E4.1, we obtain, using
D'Alembert's principle again

    $(\lambda + \mu)\,\xi^k{}_{,ki} + \mu\,g^{jk}\,\xi_{i,jk} = \rho\,\ddot{\xi}_i$      (E4.5)

No assumptions about $\lambda$, $\mu$ or $\rho$ have been made. All three
could be functions of position. The metric tensor, $g^{jk}$, would become
the Kronecker delta $\delta^{jk}$, in Cartesian coordinates of course.

Perfect  Fluid 

A perfect fluid is one in which $\mu = 0$. If we define the dilation,
$I$, to be the particle divergence,

    $I = \nabla \cdot \xi$

then we have

    $\lambda\,\partial I/\partial x^i = \rho\,\ddot{\xi}_i$

as the equation of motion for an isotropic perfect fluid.

To work the dilation into this equation, differentiate with respect
to $x^i$ and sum on i to obtain

    $\lambda \nabla^2 I = \rho\,\ddot{I}$

where we have assumed that variations in $\lambda$ and $\rho$ are small
compared with those of $I$. Unless $\lambda$ and $\rho$ are taken to be
constants, a theory based on this description of sound propagation
would not be valid for arbitrarily low frequency waves. The assumption
that water is a perfect fluid is more innocuous in sonar work; the
effect of this assumption is to make it impossible for the model of
the medium to support shear waves. These do exist in water, but the
low viscosity limits their range so greatly that they can be safely
ignored.

Since the stress tensor for a perfect fluid is

    $\tau^i_j = \lambda I\,\delta^i_j$

we can define $p = -\tau^1_1$ and $k = \lambda$ in order to obtain

    $p = -kI$

and if $c^2 = k/\rho$ then

    $c^2 \nabla^2 p = \ddot{p}$                                      (E4.6)

That  is,  within  the  limitations  of  our  assumptions,  (the  important 
ones  being:  very  small  particle  displacements;  a  linear  stress- 
strain  relationship;  nearly  constant  bulk  modulus  and  density, 
yielding  a  nearly  constant  speed  of  sound)  sound  propagation  obeys 
the  scalar  wave  equation. 
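
As a rough numerical illustration of the speed of sound implied by a
nearly constant bulk modulus and density, the sketch below assumes
textbook order-of-magnitude values for fresh water (about 2.2e9 Pa and
1000 kg/m^3); these numbers are assumptions for illustration and do not
come from this report.

```python
import math


def sound_speed(bulk_modulus_pa, density_kg_m3):
    """c = sqrt(k / rho) for bulk modulus k and density rho."""
    return math.sqrt(bulk_modulus_pa / density_kg_m3)


# Assumed illustrative values for fresh water, not taken from the report.
c_water = sound_speed(2.2e9, 1000.0)
print(round(c_water), "m/s")
```

The result is close to 1500 m/s, the usual order of magnitude quoted
for underwater sound.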

It  would  be  very  interesting  to  develop  the  following  sonar 
models  with  these  assumptions  weakened  or  eliminated,  but  it  does 
not  seem  possible  to  do  this. 



4.2  The  Two  Variable  Wave  Equation 

The  scalar  wave  equation, 

    $\nabla^2 p = \ddot{p}/c^2$                                      (E4.6)

describes the propagation of sound when wave amplitudes are small
and c, the speed of sound, is nearly constant. While this classic
differential equation has received the attention of many minds for
well over 100 years, its properties are still not completely known
and cataloged. (Courant [1])

In this section, some interesting properties of the equation
in two variables, time and one spatial dimension, are developed,
using very elementary methods, based on the properties of charac-
teristics for 2 variable differential equations. In section 4.3,
following, similar results are sought for 4 variables, but here
a variety of methods must be employed, none with complete success.

The results sought are essentially concerned with the size of the
null space of the operator defined by (E4.6) when operating on
Cauchy initial data, with and without inhomogeneous terms.

Consider the differential operator†

    $W[p] = c^2(x,t)\,p_{xx} - p_{tt}$                               (E4.7)

For scalar wave phenomena, $c^2$ is positive so that W is everywhere
hyperbolic. The characteristic curves for W, those along which W
is an interior operator, are defined by the equation

    $c^2 \phi_x^2 - \phi_t^2 = 0$

where $\phi(x,t) = 0$ defines a curve in the (x,t) plane which we assume
to be regular, i.e., $\phi_x$ and $\phi_t$ may not both vanish
simultaneously.

We see that $c\phi_x = \pm\phi_t$ on the characteristics, or, for
constant c, $\phi = x \pm ct + x_0$. Characteristics are also rays, or
the directions of propagation of wavefronts (Figure 4.1).

† The prerequisites for this discussion can be found in any
standard text, e.g. Courant [1] or Morse [1]. Note that independent
variables, x,y,z,t, used as subscripts denote differentiation:
$\partial f(\cdot)/\partial x = f_x(\cdot)$. Other subscripts, such as
n,k etc., do not denote differentiation, but serve merely as auxiliary
arguments to functions.

Now consider a simple radiation problem for constant c:

    $W[p] = -c\,S_t(t)\,\delta(x-a)$
    $p(x,0) = 0$,     $x \neq a$
    $p_t(x,0) = 0$,   $x \neq a$

where S(t) is the intensity of a point source located at x=a.

The solution, p(x,t) for all t>0 is given by

    $p(x,t) = S(t - |x-a|/c)/2$                                      (E4.8)

That is, the signal produced by the point source divides in half,
one half propagates to the left, x<a, the other to the right, x>a.
The propagation is along the rays. If many point sources are present,
or distributed sources are assumed, then E4.8 generalizes to

    $p(x,t) = (1/2) \int dS_\lambda(t - |x-\lambda|/c)$              (E4.9)

where $dS_\lambda$ is the signal intensity at $x = \lambda$, and the
integration is in the Riemann-Stieltjes sense. If $dS_\lambda$ vanishes
for all $\lambda < a$ then in the region x<a,

    $p(x,t) = S_a(t - |x-a|/c)/2$                                    (E4.10)

where $S_a$ is an equivalent source defined by

    $S_a(t) = \int dS_\lambda(t - |a-\lambda|/c)$

An analogous introduction of an equivalent point source is possible
if $dS_\lambda$ vanishes for $\lambda > a$.
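
The point-source solution and the equivalent-source construction can be
checked numerically for a handful of discrete sources. This is only a
sketch under stated assumptions: the waveform $t^3 e^{-t}$, the sound
speed, the source positions and the helper names are all hypothetical
choices made for illustration.

```python
import math

C = 2.0  # assumed constant sound speed (arbitrary units)


def make_source(scale):
    """Hypothetical smooth waveform with S(t) = 0 for t <= 0."""
    def s(t):
        return scale * t**3 * math.exp(-t) if t > 0.0 else 0.0
    return s


def field(x, t, sources):
    """Discrete form of E4.9: p = (1/2) * sum_k S_k(t - |x - lam_k|/c)."""
    return 0.5 * sum(s(t - abs(x - lam) / C) for lam, s in sources)


def equivalent_source(a, sources):
    """S_a(t) = sum_k S_k(t - |a - lam_k|/c), used when every lam_k >= a."""
    return lambda t: sum(s(t - abs(a - lam) / C) for lam, s in sources)


# Three point sources, all located at or to the right of a = 1.
a = 1.0
sources = [(1.0, make_source(1.0)), (1.5, make_source(-0.7)), (3.0, make_source(2.0))]
s_a = equivalent_source(a, sources)

# For observers at x < a the full field and the equivalent source agree.
max_diff = max(
    abs(field(x, t, sources) - 0.5 * s_a(t - abs(x - a) / C))
    for x in (-2.0, -0.5, 0.9)
    for t in (0.5, 2.0, 6.0)
)

# A lone source at x = 0: the jump in p_x across the source should equal
# -S'(t)/c, which is what the term -c S_t(t) delta(x - a) in W[p] requires.
single = [(0.0, make_source(1.0))]
t0, h = 2.0, 1e-4
p_x_right = (field(h, t0, single) - field(0.0, t0, single)) / h
p_x_left = (field(0.0, t0, single) - field(-h, t0, single)) / h
s_prime = (3 * t0**2 - t0**3) * math.exp(-t0)  # derivative of t^3 exp(-t)
jump_error = abs((p_x_right - p_x_left) + s_prime / C)
print(max_diff, jump_error)
```

Both printed numbers should be small: the first up to rounding, the
second up to the one-sided finite-difference error.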


Now we ask the question: when are different source distri-
butions indistinguishable to an observer? That is, when do two
different source distributions differ only by a vector which lies
in the nullspace of the linear operator $T: S \to p$ defined implicitly
by $W[p] = -cS_t$? For the observer we will take an open ray-connected
set D of the (x,t) plane, and by ray-connected, we will mean that
any two points $r, s \in D$ can be connected by a ("zig-zag") path of
ray-segments. Let $K_D$ be the set of constant functions on D. Letting
$D_a$ be the set of waveforms from a point source a that are observable
in D, we see that

THEOREM 1  If D is an observation region, $S_a$ and $S_b$ are point
           sources, and a<D<b, then $D_a \cap D_b = 0$.

PROOF      Let $w(x,t) \in D_a \cap D_b$ and take any two points
           $(x_0,t_0)$ and $(x_1,t_1)$, connected by the ray-path
           $\Gamma$. w(x,t) is a constant along each ray segment of
           (positive, negative) slope since ($w \in D_a$, $w \in D_b$),
           hence w is constant along all of $\Gamma$ so that
           $w(x_0,t_0) = w(x_1,t_1)$. Since the points were arbitrary,
           $w \in K_D$. Since neither $D_a$ nor $D_b$ contain $K_D$ the
           theorem is proved (constant functions are not in $D_a$ or
           $D_b$ because they violate the initial conditions).

This result is not unexpected, and admits obvious generalizations
to distributed sources and sources within the observation region. A
generalization to cover random sound velocities is also possible,
although not so obvious (Figure 4.2).
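
The idea behind Theorem 1 can be illustrated numerically: a wave from
a source on the left of D is constant along one family of rays, a wave
from the right along the other, and only a constant is constant along
both. The sketch below assumes constant c and uses hypothetical names
and sample waveforms.

```python
import math

C = 1.0  # assumed constant sound speed


def travel_direction(w, points, h=1e-4, tol=1e-6):
    """Classify a field w(x, t) as 'right', 'left', 'both' or 'neither'.

    A right-going wave f(x - ct) is constant along rays x - ct = const,
    so (d/dt + c d/dx) w = 0; a left-going wave obeys (d/dt - c d/dx) w = 0.
    """
    def derivative_along(x, t, sign):
        # Central difference along the ray direction (dx, dt) = (sign*c, 1).
        return (w(x + sign * C * h, t + h) - w(x - sign * C * h, t - h)) / (2 * h)

    right = all(abs(derivative_along(x, t, +1)) < tol for x, t in points)
    left = all(abs(derivative_along(x, t, -1)) < tol for x, t in points)
    if right and left:
        return "both"
    return "right" if right else "left" if left else "neither"


# Sample points standing in for an observation region D.
points = [(0.1 * i, 0.2 * j) for i in range(5) for j in range(5)]
from_left_source = lambda x, t: math.sin(3.0 * (x - C * t))   # source at a < D
from_right_source = lambda x, t: math.sin(3.0 * (x + C * t))  # source at b > D
constant = lambda x, t: 0.25

print(travel_direction(from_left_source, points))
print(travel_direction(from_right_source, points))
print(travel_direction(constant, points))
```

Only the constant field is classified as travelling in both directions,
which mirrors the role of $K_D$ in the proof.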




THEOREM 2  If D is an observation region, $S_a$ and $S_b$ are point
           sources with a<D<b and transmission is governed by

               $W[u] = c^2 u_{xx} - u_{tt}$

           with zero initial conditions, and

               $0 < \underline{c} \le c(x,t) \le \bar{c} < \infty$

           and c(x,t) twice continuously differentiable, then
           $D_a \cap D_b = 0$.

PROOF      Make a change of coordinates in the equation

               $W[u] = c^2(x,t)\,u_{xx} - u_{tt} = h(x,t)$           (E4.11)

           by letting

               $\xi = \xi(x,t)$,   $\eta = \eta(x,t)$

           where we assume that the transformation
           $\psi: (x,t) \to (\xi,\eta)$ is everywhere invertible, i.e.,
           $\xi_x \eta_t - \xi_t \eta_x \neq 0$. We find, for the
           second-order terms:

               $u_{\xi\xi}[c^2\xi_x^2 - \xi_t^2] + 2u_{\xi\eta}[c^2\xi_x\eta_x - \xi_t\eta_t] + u_{\eta\eta}[c^2\eta_x^2 - \eta_t^2]$
               $= h(x(\xi,\eta), t(\xi,\eta)) = g(\xi,\eta)$         (E4.12)

           If the transformation $\psi: (x,t) \to (\xi,\eta)$ is chosen
           so that

               $c\xi_x = \xi_t$,   $c\eta_x = -\eta_t$               (E4.13)

           (solution of these two first order partial differential
           equations is certain since c is in $C^2$) then the
           transformed equation, E4.12, becomes

               $-4\,\xi_t \eta_t\,u_{\xi\eta} = g(\xi,\eta)$

           Since c is bounded away from zero, $\xi_t$ can be zero only
           if $\xi_x$ is zero (E4.13). But if either is, both are, and
           the transformation becomes singular, which has been ruled
           out. Put another way, solutions of E4.13 which are
           bounded away from zero exist since c is bounded away from
           0; see Courant [1], page 491 et seq.

           As a result, E4.12 can be written as

               $u_{\xi\eta} = -g(\xi,\eta)/4\xi_t\eta_t$             (E4.14)

           This is a canonical form for the constant coefficient
           wave equation; $\xi$ = constant and $\eta$ = constant are
           characteristics and rays of the solution. Theorem 1 now
           applies showing that

               $D_a(\xi,\eta) \cap D_b(\xi,\eta) = 0$

           and the theorem is established.

It is now easy to prove that random wave velocities do not enlarge
the nullspace of the transmission operator very much:

THEOREM 3  If D is an observation region, $S_a$ and $S_b$ are point
           sources and transmission is governed by

               $W_c[u] = c^2(x,t)\,u_{xx} - u_{tt}$

           with zero initial conditions, and c is randomly chosen
           from a set $C^*$ each of whose elements c satisfy

               $0 < \underline{c} \le c(x,t) \le \bar{c} < \infty$

           and each of which is twice continuously differentiable,
           then $D_a \cap D_b = 0$ (by $D_a$ we mean
           $\bigcup_{c \in C^*} D_a(c)$).

PROOF      $D_a(c) \cap D_b(c') = 0$ by Theorem 2.

This result shows that, with only two discrete directions from
which signals can come, random propagation can not hurt your discri-
mination. A random medium just is not able to make a left going
wave look like a right going one.


4.3  The  Four  Variable  Wave  Equation 

Now let W be the differential wave operator in 4 variables:

    $W[p] = \nabla^2 p - p_{tt}$                                     (E4.15)

We are interested in an inverse problem in differential equations:
given suitable $\chi$ and the equation $W[p] = \chi$ with suitable
boundary conditions, we can solve for p. But, given p, can we solve for
$\chi$? The obvious answer is yes: everywhere p is known, $\chi$ is
known also; just apply W to p. But can $\chi$ be determined in a region
of the independent variables removed from the region in which p is
known? In general the answer is no. With further assumptions about
$\chi$, however, it can be yes.

This whole area of inverse problems in partial differential
equations is difficult and relatively untouched: the problems are
generally ill-conditioned. If we think in terms of some inverse
operator $J$, then $J$ will be unbounded. What little that has been
done with these problems has been done with relatively tractable
equations: Poisson's equation, where the entire apparatus of complex
variable theory can be used, and equations in two independent
variables where reduction to ordinary differential equations along
characteristics is a powerful trick. (Lavrentiev [1])

A  complete  treatment  of  this  problem  can  not  be  expected,  then. 

The  most  we  can  hope  to  do  is  to  treat  a  few  special  cases  and  to 
acquire  some  insight. 

Looking at Figure 4.3, we see that at each point $V = (t,x,y,z)$
in $R^4$, a characteristic conoid exists for the operator W. This
conoid, H, is a right circular 45° hypercone, axis parallel to the
t-axis. Let $P_0$ be a hyperplane $t = t_0 > t_v$ and $P_g$ a hyperplane
$t = t_g < t_v$. Intuitively, $P_0$ is the observation plane, and $P_g$
is the given plane, or the plane of assumed values. The conical
frustum whose base is the intersection, $B_0$, of $P_0$ and H is
mirrored by reflection across $P_0$ into a frustum of another right
circular 45° hypercone, L, which has a vertex W in the plane
$t = t_w = 2t_0 - t_v$. The sheet of L, called $\underline{L}$, which
lies below $t_w$ intersects the plane $P_g$ also; we call the base
thus determined $B_g$ while the base of H in the plane $P_g$ is
$B_{gh}$.


The uniqueness (but not the existence) of any solution to
the initial value problem

    $W[u] = 0$ in $L \cap H$
    $(u, u_t) = (\phi_0, \phi_1)$ on $B_0$

is established by the following well known theorem. The proof,
based on an "energy" integral, is worth repeating for the light it
sheds on the behavior of the wave equation.


THEOREM 4  A solution u to the differential equation $W[u] = 0$
           in $L \cap H$ satisfying arbitrary initial conditions
           $(u, u_t) = (\phi_0, \phi_1)$ on $B_0$ (where $\phi_0$ is
           twice continuously differentiable and $\phi_1$ is once
           continuously differentiable) is unique.

PROOF      We will show that the initial conditions uniquely deter-
           mine the value of the solution at the point W. As every
           other point within $L \cap H$ is the vertex of a
           characteristic cone whose base is within the sphere $B_0$,
           this will establish the uniqueness of the solution
           throughout $L \cap H$. Suppose, then, that $u_1$ and $u_2$
           are both twice continuously differentiable solutions of
           $W[u] = 0$, with the same initial conditions
           $(\phi_0, \phi_1)$ on $B_0$. Their difference,
           $u = u_1 - u_2$, must be everywhere zero if uniqueness is to
           obtain. Noting that u satisfies $W[u] = 0$ within
           $L \cap H$ with zero initial data on $B_0$, we integrate
           over all of $\underline{L}$ above $B_0$, that is, K:

               $0 = \int_K (\partial u/\partial t)(\partial^2 u/\partial t^2 - \nabla^2 u)\,dx\,dt$

           since

               $\partial u/\partial t \cdot \partial^2 u/\partial t^2 = (\partial/\partial t)(\partial u/\partial t)^2/2$

           and

               $\partial u/\partial t \cdot \partial^2 u/\partial x_i^2 = (\partial/\partial x_i)(\partial u/\partial t \cdot \partial u/\partial x_i) - \partial u/\partial x_i \cdot \partial^2 u/\partial t\,\partial x_i$
               $= (\partial/\partial x_i)(\partial u/\partial t \cdot \partial u/\partial x_i) - (\partial/\partial t)(\partial u/\partial x_i)^2/2$

           we see that the above integral can be written as

               $0 = \int_K \{ (\partial/\partial t)[(\partial u/\partial t)^2 + \sum_i (\partial u/\partial x_i)^2]/2 - \sum_i (\partial/\partial x_i)(\partial u/\partial t \cdot \partial u/\partial x_i) \}\,dx\,dt$

           Since Gauss' theorem says
           $\int_V \nabla \cdot F\,dv = \int_S F \cdot n\,d\sigma$, we have:

               $0 = (1/2)[\int_{B_0} d\sigma + \int_{K'} d\sigma] \{ [(\partial u/\partial t)^2 + \sum_i (\partial u/\partial x_i)^2] \cos(n,t)$
               $- 2\sum_i (\partial u/\partial t)(\partial u/\partial x_i)\cos(n,x_i) \}$      (E4.16)

           where $K'$ is the lateral surface of K. The integral
           over $B_0$ is zero since the initial data are zero.
           Multiplying and dividing by cos(n,t), which is constant
           on $K'$, and using the identity, good on $K'$,

               $\cos^2(n,t) = \sum_i \cos^2(n,x_i)$                  (E4.17)

           we obtain

               $0 = (1/2\cos(n,t)) \int_{K'} \sum_i [(\partial u/\partial t)\cos(n,x_i) - (\partial u/\partial x_i)\cos(n,t)]^2\,d\sigma$

           From this it follows that on $K'$

               $u_t/\cos(n,t) = u_{x_i}/\cos(n,x_i) = v$

           Using this fact, we evaluate the change in u along some
           generator, m, of K:

               $\partial u/\partial m = u_t \cos(m,t) + \sum_i u_{x_i} \cos(m,x_i)$
               $= v[\cos(n,t)\cos(m,t) + \sum_i \cos(n,x_i)\cos(m,x_i)]$
               $= v\cos(m,n)$

           which is zero since a generator and normal are
           perpendicular. Now, letting the generator m meet $B_0$ in
           the point $m_0$,

               $\int_{m_0}^{W} (\partial u/\partial m)\,dm = 0 = u(W) - u(m_0)$

           and since $u(m_0) = 0$, so does u(W), which establishes the
           theorem.
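
Two geometric facts carry the proof: the identity E4.17 on the lateral
surface of a 45° cone, and the perpendicularity of generator and
normal. The sketch below checks both for a cone $|x| = t_w - t$ with
unit wave speed; the parametrization and the function name are
assumptions made for illustration.

```python
import math


def cone_normal_and_generator(theta, phi):
    """Unit normal n and unit generator m of the cone |x| = t_w - t (c = 1)
    at the surface point whose spatial direction is (theta, phi).
    Components are ordered (t, x1, x2, x3)."""
    omega = (
        math.sin(theta) * math.cos(phi),
        math.sin(theta) * math.sin(phi),
        math.cos(theta),
    )
    r = 1.0 / math.sqrt(2.0)
    n = (r,) + tuple(r * w for w in omega)    # normalized gradient of t + |x|
    m = (-r,) + tuple(r * w for w in omega)   # direction away from the vertex
    return n, m


worst_identity = 0.0
worst_dot = 0.0
for i in range(1, 10):
    for j in range(12):
        n, m = cone_normal_and_generator(i * math.pi / 10, j * math.pi / 6)
        identity = n[0] ** 2 - sum(c * c for c in n[1:])   # E4.17
        dot = sum(a * b for a, b in zip(n, m))             # cos(m, n)
        worst_identity = max(worst_identity, abs(identity))
        worst_dot = max(worst_dot, abs(dot))
print(worst_identity, worst_dot)
```

Both printed values should vanish up to rounding error.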


This theorem can be interpreted in different ways. From the
usual point of view it proves the uniqueness of a linear operator
W which maps $(\phi_0, \phi_1) \in C^2(B_0) \times C^1(B_0)$ into
$u \in C^2(L \cap H)$. Applying the theorem to the sphere $B_g$, a more
symmetric view has us consider the linear operator V which maps
$(\phi, \phi_t) \in C^2(B_g) \times C^1(B_g)$ into
$(u, u_t) \in C^2(B_0) \times C^1(B_0)$. Since V maps functions on a
given domain into functions with a smaller domain, intuition says that
V should be many to one, i.e., have a non-vanishing kernel. If this
were not the case, then the following theorem would hold:


Conjecture 1  A solution to the differential equation $W[u] = 0$
           in $\underline{L}$ satisfying the conditions

               $(u, u_t) = (\phi_0, \phi_1)$ on $B_0$

           with $\phi_0 \in C^2$, $\phi_1 \in C^1$ and u=0 in H, is
           unique.

Attempted Proof:  We will attempt to prove uniqueness by considering
           the difference, u, of the two solutions $u_1$ and $u_2$ with
           the same data on $B_0$. Since $\underline{L}$ extends to
           $-\infty$, we will introduce a hypercone J with upper sheet
           $\bar{J}$ and axis r=0, vertex $V'$ at some point below V
           (Figure 4.3), and consider u only within the region
           $\underline{L} \cap \bar{J}$. Within $L \cap H$
           uniqueness follows directly from Theorem 4. Within
           H, u is zero by assumption. This leaves a region
           bounded on the outside by portions $L''$, $J'$ of
           the sheets of L and J, on the inside by portions $H'$ and
           $H''$ of the upper and lower sheets of H. The steps in the
           proof of Theorem 4 carry over for this region, up to
           E4.16, which appears here with 4 surfaces instead of 2:

               $0 = (1/2)[\int_{L''} d\sigma + \int_{J'} d\sigma + \int_{H'} d\sigma + \int_{H''} d\sigma]\{\ldots\}$

           The integrals over $H'$ and $H''$ vanish since u is zero
           there. As before, we multiply and divide by cos(n,t) and
           apply the identity E4.17 to find an expression with two
           terms, one an integral over $L''$ and the other an integral
           over $J'$. The two terms have opposite signs because one (+)
           represents outgoing waves across $L''$, while the other (-)
           represents ingoing waves across $J'$. Thus fails the proof.


This failure of an attempted proof does not show that the
conjecture is false, but it strongly suggests that it is. Note
that this proof can be carried through successfully in one spatial
dimension because ingoing/outgoing translates into leftgoing/right-
going in one spatial dimension, and these wave types are independent.
The proof also is successful in 3 dimensions if u is restricted to
be spherically symmetrical, for in that case, ingoing/outgoing
wave types are independent.

The initial conditions on $B_g$ that lie in the kernel of the
operator V are still unknown, but it seems that there are some.
Furthermore, this kernel of V is in addition to the kernel con-
sisting of functions which are zero on $B_g - B_{gh}$. These are in the
kernel of V because Huygens' principle holds in 3 spatial dimensions,
so that $B_g - B_{gh}$ is the complete domain of dependence of $B_0$ or
the point W. In two spatial dimensions (in fact, all even spatial
dimensions) by contrast, the failure of Huygens' principle makes
V even less invertible.

While this discussion has been about V, an operator from initial
conditions into initial conditions, by Duhamel's principle the same
will be true of the transmission operator that maps into u from
source distributions. That is, the transmission operator will have
a kernel whenever u can only be observed in a limited region. In
order to display one way in which this can happen, we look at the
field produced by a spherical shell source $h(r,t) = h(t)\delta(r-s)$.

Green's function in the transform domain is

    $g_\omega(r, r_0) = \exp[i\omega R]/4\pi R$

    $R = |r - r_0|$

so the field at $r_0$ is

    $U(r_0, s, \omega) = \int dv\,H(\omega)\,\delta(r-s)\exp[i\omega R]/4\pi R$

where $H(\omega)$ is the Fourier transform of h(t). A little algebra
gives:

    $U(r_0, s, \omega) = sH(\omega)\exp[i\omega s]\sin[\omega r_0]/\omega r_0$

From this we see that exact cancellation is possible everywhere
($r_0 < s$) inside the spherical shell source if only a second spherical
shell outside the first, at a radius q>s, is excited by

    $G(\omega) = H(\omega)\exp[i\omega(s-q)]\,s/q$
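
The closed form for the shell field and the matching condition on the
outer shell can be checked by direct quadrature over the sphere. This
is a sketch under assumptions: unit wave speed, $H(\omega) = 1$, and
arbitrary sample values for $r_0$, s, q and $\omega$; the function
names are hypothetical.

```python
import cmath
import math


def shell_field_numeric(r0, s, omega, n=4000):
    """Midpoint-rule integral of exp(i*omega*R)/(4*pi*R) over a sphere of
    radius s for an observer at distance r0 from the centre (c = 1, H = 1)."""
    total = 0.0 + 0.0j
    d_theta = math.pi / n
    for k in range(n):
        theta = (k + 0.5) * d_theta
        big_r = math.sqrt(s * s + r0 * r0 - 2.0 * s * r0 * math.cos(theta))
        area = 2.0 * math.pi * s * s * math.sin(theta) * d_theta
        total += cmath.exp(1j * omega * big_r) / (4.0 * math.pi * big_r) * area
    return total


def shell_field_closed(r0, s, omega):
    """Closed form s*exp(i*omega*s)*sin(omega*r0)/(omega*r0) for r0 < s."""
    return s * cmath.exp(1j * omega * s) * math.sin(omega * r0) / (omega * r0)


r0, s, q, omega = 0.4, 1.0, 1.7, 5.0
closed_form_error = abs(
    shell_field_numeric(r0, s, omega) - shell_field_closed(r0, s, omega)
)

# Outer shell excitation G = H*exp(i*omega*(s - q))*s/q with H = 1.
g = cmath.exp(1j * omega * (s - q)) * s / q
residual = abs(
    shell_field_numeric(r0, s, omega) - g * shell_field_numeric(r0, q, omega)
)
print(closed_form_error, residual)
```

With this choice of G the outer shell produces, inside the inner shell,
exactly the field of the inner shell, so their difference vanishes
there.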

This cancellation depends upon the spherical symmetry. To
see just how critical this dependence is, consider two hemispherical
sources at radii s and q. One finds that the field from one
hemisphere is:

    $U(r_0, s, \omega) = sH(\omega)\{\exp[i\omega(r_0^2+s^2)^{1/2}] - \exp[i\omega(r_0-s)]\}/2i\omega r_0$

and for both it is:

    $2i\omega r_0[U_H - U_G] = sH(\omega)\exp[i\omega(r_0^2+s^2)^{1/2}] - sH(\omega)\exp[i\omega(r_0-s)]$
    $- qG(\omega)\exp[i\omega(r_0^2+q^2)^{1/2}] + qG(\omega)\exp[i\omega(r_0-q)]$

Proper choice of $G(\omega)$ can cause cancellation of the second and
fourth terms, but no cancellation of the first and third terms is
possible. Some leakage around the edges of the hemispheres always
occurs.

If the wave equation is written for the velocity potential $\psi$,
where particle velocity is $v = -\nabla\psi$ and pressure is
$p = \rho\psi_t$ for a density $\rho$, then the energy flux, or
intensity is

    $I(\psi) = p\,v = -\rho\,\psi_t \nabla\psi$                      (E4.18)

Given $\psi$ in some region, what cancels it? Obviously, it is
$-\psi$. But,

    $I(\psi) = I(-\psi)$

That is, the wave energy flux is the same for $\psi$ and $-\psi$. In
words, the cancelling wave is going in exactly the same direction at
every point of cancellation. This explains the leakage discovered in
the two hemisphere problem; failure of spherical symmetry at the edges
of the hemispheres results in the generation of waves that are not
going in exactly the same direction.

Now let a point source launch a wave, $\psi$, and follow a portion
of the wave front that travels towards the origin. This portion of
the wavefront can be cancelled, but only by another wavefront
traveling the same direction but with opposite sign: $-\psi$. Such a
wavelet could only be generated by a (portion of) a spherical
shell with the point source as center. In particular, a spherical
shell of large radius with center at the origin can not launch
such a wave.

Summarizing  these  bits  and  pieces  of  evidence,  we  state 

Conjecture 2  Let u be determined by

               $W[u] = h(t)\,\delta(x - x_0) + g(\theta,\phi)\,\delta(r - r_0)$

               $(u, u_t) = (0,0)$ at $t = -\infty$

           h(t) a given point source at the point $x_0 \neq 0$,
           $g(\theta,\phi)$ a given spherical shell source at radius
           $r_0$.

           Then u is non-zero in any open connected set containing
           the origin.

This conjecture requires proof from a mathematical point of
view. From an engineering viewpoint, it can be regarded as true.
Extensions to a finite number of point sources and non-concentric
shells follow from the truth of this conjecture.




4.4 Passive Sonar in One-space

Following the lines of section 2.4 we begin with the construc-
tion of a class, call it OSD, of level 4 C/D models. Models in
this class differ only in their measures, and the class is broad
enough to encompass a wide variety of specific sonar models.


Space    Meaning and Description

$S_1$    {0,1}, standing for {no signal, signal} as usual.

$S_2$    Encoding operators $e_h$ with values in $S_3$ defined by
             $e_h(a) = ah$
         The set of encoding operators is isomorphic and isometric
         to $S_3$ under the map $i_2: e_h \to h$.

$S_3$    Source distribution functions for the plane, representing
         the signal and forming a subspace of $S_5$. These are
         limited by the assumption that h(x,t)=0 for all
         $x \ge \alpha$, that is, the signal is confined to a right
         half-plane.

$S_4$    Additive noise operators, $n_g(h) = g+h$ with values in
         $S_5$. The noise operators are isomorphic and isometric to a
         subspace of $S_5$ under the map $i_4: n_g \to g$. This
         subspace is restricted by the assumption that g(x,t)=0 for
         all $x \le \beta$, so that the noise is restricted to a left
         half plane.

$S_5$    Source distribution functions for the plane. These
         functions are intended to lie within the domain of the
         transmission operators in the space $S_6$, but since they
         enter via an integral, no great number of restrictions
         need be placed upon them. We choose the space of all
         Lebesgue square integrable functions in all space-time.
         Intuitively, this space, $L^2(R^2)$, is reasonable,
         representing as it does a finite energy constraint on the
         signals and noise.

$S_6$    Transmission mappings, $T_c$, from $S_5$ into $S_7$, defined
         implicitly by solution of
             $c^2(x,t)\,u_{xx} - u_{tt} = h(x,t)$
         where c(x,t) is twice continuously differentiable and
             $0 < \underline{c} \le c(x,t) \le \bar{c} < \infty$
         and $h \in S_5$. The Cauchy initial data are prescribed as
         $(u, u_t) = (0,0)$ on the line $t = -\infty$.

$S_7$    All solutions of the inhomogeneous wave equation, but
         considered in a given observation region D with
             $\alpha < D < \beta$
         $S_7$ is a linear manifold within $L^2(D)$ in view of $S_5$.

$S_8$    Detection operators. All possible measurable maps of
         $S_7$ into $S_9$. Which ones are chosen depends upon the
         particular model of class OSD which is being considered.

$S_9$    {0,1} standing for {noise alone, signal plus noise}.


We now test the class OSD for singular models. If M is one
of the factorable models of OSD, then stage $M_1$ is singular by
theorem 3.7, while stage $M_2$ is singular (i.e., preserves singularity)
by theorem 3.1. Stage $M_3$ is also seen to be singular by application
of theorem 3.1 combined with theorem 4.3. This leaves only the fourth
stage, the detector, between us and singularity of the entire model.
But any detector that maps the support of the corresponding measure
into $i \in S_9$ preserves singularity, and there are many of these. Our
conclusion, then, is that all of the factorable models in OSD that
have "good" detectors, which certainly includes any optimal (Bayesian)
detectors, are singular, hence inadequate as analytic guides to
reality.

This holds for quite general signal and noise locations (as long
as they are disjoint and the detector is between them) and for
arbitrary sound velocities, even random velocities (as long as
they are bounded away from zero, and glossing over the inadequacy
of the scalar wave equation's description of sound transmission
when the sound velocity is not slowly varying, which fact really
means that some of the models in the class OSD are not good images
of reality, singular or not). Such generality is possible because
of the geometric simplicity of the problem: detection really reduces
to a determination of the direction of the incident waves: left-
going or right-going.


4.5  Passive  Sonar  in  Three-space 

Relatively  little  In  the  definition  of  the  class  OSD  needs  to 
be  changed  in  order  to  generate  a  class,  TSD  of  4  stage  C/D  models 
applicable  to  sonar  problems  in  3  spatial  dimensions.  We  have: 

Space    Meaning and Definition

$S_1$    $\{0,1\}$, standing for {no signal, signal}.

$S_2$    Encoding operators $e_h$ with values in $S_3$, defined by

             $e_h(a) = a\,h$

         The usual identification with $S_3$ is provided by the
         map $f: e_h \to h$.

$S_3$    Source distribution functions for $R^4$ representing the
         signal and forming a subspace of $S_5$.

$S_4$    Noise operators, $n_g(h) = g + h$. The usual embedding of
         $S_4$ into $S_5$ is provided by $i: n_g \to g$. Notice that
         $S_3$ and $S_4$ have not been restricted to half-spaces as
         they were in the class OSD.

$S_5$    Source distribution functions for 4-space, equal to
         $L^2(R^4)$. The discussion of $S_5$ in the definition of OSD
         applies here as well.

$S_6$    A single transmission operator, $T: h \to u$, as defined by
         solution of the problem

             $\nabla^2 u - u_{tt} = h(z,t)$
             $(u, u_t) = (0,0)$ at $t = -\infty$

         The theory for the class TSD is not as comprehensive
         as that for OSD, as the degenerate nature of $S_6$ indicates.

$S_7$    Defined by solution of the wave problem above. A linear
         manifold within $L^2(D)$ when restricted to a compact set
         $D \subset R^4$.

$S_8$    Detection operators, all possible measurable maps of $S_7$
         into $S_9$, but with an implicit statement of the observation
         domain. This represents a slight change from $S_8$ of class
         OSD. There, the observation domain was explicit in the
         definition of $S_7$, rather than implicit in the choice of
         a detection operator from $S_8$.

$S_9$    $\{0,1\}$ for {noise alone, signal plus noise}.
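
For orientation, the transmission operator of $S_6$ can be written out explicitly. The expression below is the standard retarded-potential solution of the inhomogeneous wave equation with unit sound speed and a zero state at $t = -\infty$. It is not stated in the report; the integration variable $z'$ and the restriction to source functions $h$ for which the integral converges are assumptions made here for illustration.

```latex
u(z,t) \;=\; (Th)(z,t) \;=\; -\frac{1}{4\pi}\int_{R^3}
    \frac{h\bigl(z',\, t-|z-z'|\bigr)}{|z-z'|}\; dz'
```

Each point of the source distribution contributes a spherically spreading wave that arrives after a delay equal to the distance travelled, which is the exactly known wave-front on which the singularity arguments of this section rest.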


Each model in TSD is singular in stage $M_1$ by theorem 3.7, and
each factorable model for which $\operatorname{supp}\mu_3$ and
$\operatorname{supp}\mu_4$ are disjoint is singular in stage $M_2$ by
theorem 3.1. Since $S_7$ contains waves defined throughout $R^4$,
stage $M_3$ is singular whenever $M_2$ is, by application of
theorem 3.5. This leaves the detector between us and singularity
of the model. (Notice that this chain of reasoning differs from
that used in discussing the singularity of OSD. There stage 3 was
key, and singularity occurred when the detection region was between
the signal and noise sources. Here we are pushing the key problems
back to stage 4, the detection stage.)

When $\mu_4$ peaks up to one on spherical shell sources (at fixed
or variable, known or unknown, radii) and $\mu_3$ peaks up to one on
point sources (fixed or variable location, one or any finite
number of sources) located inside the shell sources, but not at the
centers of the shell sources, and the detector selected from $S_8$
observes an observation region containing the origin, then, if
the detector is optimal, the model is singular according to
conjecture 4.2.

Some of the models considered by Vanderkulk [1] (those without
self-noise) are members of the class TSD. $M_1$ is singular, as ever.
$\mu_3$ peaks up on a single point source at infinity, while $\mu_4$
peaks up on a shell source of infinite radius. $S_3$ and $S_4$ are
effectively separate linear spaces, so that $M_2$ is singular
(theorem 3.1), not by virtue of the spatial separability of the
sources, but because $\mu_4$ generates a process of independent
(spatial) increments on the surface of the shell. The
$\mu_3$-generated process of point sources has measure zero since all
of the rest of the sphere has zero excitation. $M_3$ is singular
again, which brings us to the detector.

As the number of phones in the Vanderkulk model increases to $\infty$,
observation becomes continuous and the whole model becomes singular,
as he shows. This result supports conjecture 4.2.

A fruitful way of looking at these results is this: an optimal
detector fed continuous observations can form a zero-width beam
pattern, and perform perfect range discrimination for point sources
(any finite number of them). The key is the exactly known wave-front
available from the source(s). Other signal models that provide
exactly known wave-fronts will likewise be singular in the limit
of continuous observation.

The introduction of self-noise at the hydrophones is not a
cure for this singularity (Vanderkulk [1]). A slight modification
of the class TSD suffices to include the self-noise. Spaces $S_8$ and
$S_9$ of TSD become $S_{10}$ and $S_{11}$ of TSD', and new spaces are
introduced:


Space    Meaning and Definition

$S_7$    Solutions of the wave equation. Subspace of $S_9$ after
         restriction to the observation set $D$.

$S_8$    Self-noise injection, $\delta_k(h) = k + h$. $S_8$ is embedded
         into a linear subspace of $S_9$ by the map
         $i: \delta_k \to k$.

$S_9$    $L^2(D)$ where $D$ is a compact observation set in $R^4$.

$S_{10}$   (old $S_8$)

$S_{11}$   (old $S_9$)


The self-noise, stage 4 of TSD', fails to remove the singularity
because it is spatially white: its power is spread equally over all of
$S_9$, so it has zero power on any one dimension of $S_9$ (in the limit
of continuous observation).
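
A finite-array analogue of this argument can be sketched numerically. The example below is not from the report. It assumes N hydrophones that each receive the same known signal sample (a perfectly known wave-front after steering) plus independent zero-mean Gaussian self-noise, and it simply averages the phone outputs. The function name `averaged_output_snr` and all of its parameters are hypothetical.

```python
import random
import statistics


def averaged_output_snr(n_phones, sigma=1.0, amplitude=1.0, trials=4000, seed=0):
    """Monte Carlo estimate of the output SNR when n_phones outputs are averaged.

    Assumption (not from the report): every phone sees the same signal
    amplitude plus independent Gaussian self-noise of standard deviation sigma.
    """
    rng = random.Random(seed)
    outputs = []
    for _ in range(trials):
        # Average of the independent self-noise samples across the array.
        mean_noise = sum(rng.gauss(0.0, sigma) for _ in range(n_phones)) / n_phones
        outputs.append(amplitude + mean_noise)
    # Signal power divided by the residual noise power in the steered direction.
    return amplitude ** 2 / statistics.pvariance(outputs)


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, round(averaged_output_snr(n), 1))
```

Under these assumptions the noise power left in the single steered direction falls roughly as sigma squared over N, so the output SNR grows about linearly with the number of phones and has no finite bound. That is the discrete counterpart of white noise having zero power on any one dimension in the continuous limit.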




4.6  Implications 

The general conclusion to be drawn from all of this is that the
available sonar models are inadequate. How can they be improved?

The possibilities are: stages 1 and 2 might be modified to cause
the signal and the noise to overlap, but this means putting power
from the noise at the same spatial locations as the signal, and
since the signal can be anywhere, the whole volume of 3-space must
be filled with noise sources.† At the same time, the signal must
be made into a distributed source in order to destroy the perfect
wave-front generated by a point source. This medicine seems
excessively bitter: the noise model that results has little
resemblance to the noise sources that we think are present in the
ocean. Furthermore, point source signals should be permissible
since any distributed source of finite dimensions looks like a
point source as it recedes.

† Even this might not work since, in the Gaussian noise case
with independent radiators, a finite radius spherical ensemble
produces the same correlation function within the sphere as does
a spherical surface ensemble (Cron [1], [2]). There is also the
problem of infinite energy: if an infinite radius spherical volume
ensemble is postulated, there must be zero energy generated in
every differential element of volume!

As we have seen, introduction of self-noise does nothing for
us, so, as the only remaining possibility, the transmission stage
must be modified. The cure is easy to talk about, difficult to
use. It consists of modeling the randomness of the medium. If
there is a low frequency signal cut-off, this modeling can be
accomplished by perturbing the speed of sound in the wave equation
since volume inhomogeneities can then be assumed to be much greater
than a wavelength (section 4.1). $T_\lambda$, the transmission operator
perturbed by a small amount $\lambda$, maps $x$ into $u$ as defined by
solution of

    $(c + \lambda(z,t))^2 \nabla^2 u - u_{tt} = x$
    $(u, u_t) = (0,0)$ at $t = -\infty$

Unfortunately, it is difficult to estimate the realism of the
low frequency cut-off assumption. Officer [1], for instance, gives
estimates for when the eikonal equation is a good approximation to
the wave equation, but not for when the wave equation is a good
approximation to the physical situation. It is probably not too
good, especially in the upper ocean where a fair amount of sea-life
serves to complicate things by creating smaller scale volume
inhomogeneities.

In order of decreasing realism, and increasing analytic ease,
it is suggested that models be modified to

1.  Contain the transmission operator defined by the wave
    equation with random speed of sound and (scattering type)
    terms due to low frequency signals.

2.  Contain the transmission operator defined by the wave
    equation with random speed of sound, without scattering
    terms.

3.  Contain wave-front perturbation noise introduced as arrival
    time jitter at each hydrophone.




Only 3 seems simple enough to lead to analytic results. However,
extensive analytic and numeric work with 1 and 2 should be done to
model the statistics of the arrival time jitter process. This kind
of modification of sonar models should have a significant effect
on the results of sonar analyses. It is probable, for instance,
that a point source of interference will cost considerably more
than one hydrophone to null (Schultheiss [2]) when perfect wave-
fronts are eliminated. It is not clear what effect a model with
jitter will have on detection in the limit of continuous observation.
Modeling jitter in that situation should provide an interesting
mathematical challenge, as it would seem to require a stochastic
process whose elements are (continuous?) maps of a set into itself.
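
To make suggestion 3 concrete, the following is a minimal sketch of arrival time jitter at a finite number of hydrophones. It is an illustration under stated assumptions, not the report's model: the signal is a single-frequency tone, the array is already steered at the source, and the jitter at each phone is independent zero-mean Gaussian with a common RMS value. The function name `coherent_gain_with_jitter` and its parameters are hypothetical. For Gaussian jitter the expected normalized beam power has the closed form exp(-s2) + (1 - exp(-s2))/N, with s2 the squared product of angular frequency and RMS jitter, and the simulation is checked against it.

```python
import cmath
import math
import random


def coherent_gain_with_jitter(n_phones, freq_hz, jitter_rms_s, trials=2000, seed=0):
    """Mean normalized power of a steered narrowband sum under time jitter.

    Returns (simulated, predicted). A value of 1.0 corresponds to a perfect
    wave-front. Assumption: independent Gaussian jitter at each hydrophone.
    """
    rng = random.Random(seed)
    omega = 2.0 * math.pi * freq_hz
    total = 0.0
    for _ in range(trials):
        # Each phone contributes a unit phasor rotated by its own time error.
        beam = sum(
            cmath.exp(1j * omega * rng.gauss(0.0, jitter_rms_s))
            for _ in range(n_phones)
        ) / n_phones
        total += abs(beam) ** 2
    simulated = total / trials
    s2 = (omega * jitter_rms_s) ** 2  # phase-jitter variance in rad^2
    predicted = math.exp(-s2) + (1.0 - math.exp(-s2)) / n_phones
    return simulated, predicted


if __name__ == "__main__":
    print(coherent_gain_with_jitter(50, 1000.0, 1e-4))
```

In this sketch the coherent part of the gain is capped at exp(-s2) however many phones are added, which is one way in which the loss of a perfect wave-front limits what the array can do. It says nothing about the limit of continuous observation, where the jitter can no longer be taken as independent from point to point.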


5.  SUMMARY 


After providing a summary of the contributions made by each
of the previous chapters, we present a short list of further
research topics (of all sizes).

5.1  Contributions  of  this  Work 

The  chief  contributions  made  in  this  work  may  be  divided 
conveniently  along  chapter  lines: 

Chapter   Contribution

1         Analysis of array design problems, showing their
          relationship to model singularity.

2         Development of a means for classifying most models of
          communication and detection. Presentation of an adequate
          and precise definition of model singularity.

3         Discovery of an underlying feature of singularities in
          certain kinds of models, namely, inequality of the
          signal and noise subspaces.

4         Application of these results to sonar detection models,
          with the conclusion that models currently used are
          inadequate and may give misleading results. Suggestions
          for improving the models are given.


5.2  Possible Directions for Further Work

The possibilities presented for further work are also
conveniently treated on a chapter, section basis (we restrict
this list to problems directly suggested by the work presented
here):


Section

1.3
2.2
2.3
2.4, 5, 6
2.7

Suggested Extensions or Modifications

Any extension of the theorems on orthogonality versus
equivalence of Gaussian measures, as revealed by properties
of their covariances, to the sonar case would be
very interesting.

The model apparatus defined in this section provides a
convenient skeleton for a taxonomy of detection models,
the compilation of which would serve to consolidate the
understanding of C/D problems that has been achieved so
far and prepare a base to support further achievement.
Sufficient conditions for existence of induced measures
are needed.

Further examples could profitably be investigated.
N-dimensional Gaussian processes, as well as processes
defined more directly by their sample spaces, await
treatment (Parthasarathy [1]). Additional topologies
might be investigated.

Additional performance criteria could well be investigated
for continuity properties: Neyman-Pearson,
for instance, is closely related to the Bayesian risk.
Maximum information transfer is another candidate for
investigation. Also, while continuity of n in P and P*

V*

has been shown, a proof of continuity in P is lacking.

4.3, 5

A good deal of open ground lies here, but it may continue
to lie fallow through infertility. At any rate, conjecture
4.1 could use a counterexample or a proof, while conjecture
4.2 needs a proof, and many theorems similar to conjecture
4.2 need to be investigated (they would differ from
conjecture 4.2 chiefly in the source geometries assumed).

This can be paraphrased by saying that a much deeper
understanding of the transmission operator is a
prerequisite to better understanding of sonar models.

Jitter models beckon. Also, analytic work, numeric work,
and field experiments, to provide the statistics of the
jitter process. Analytic work to justify the jitter

.  .ii’P.i  M.iii:  .  l  s  p  Vs  *  |  'O  r  t  K  is.  the  1.  i  t,i  i  t  ot 

,  ....  s  .  v. 1 1  i  on  . 




BIBLIOGRAPHY 


Anderson, V.C.

1  "Digital Array Phasing", JASA 32 (1960) 867

Aronszajn, N.

1  "La Theorie des Noyaux Reproduisants et ses Applications.
   Premiere Partie", Proc. Cambridge Phil. Soc. 39 (1943) 133

2  "Theory of Reproducing Kernels", Trans. Amer. Math. Soc.
   68 (1950) 337-404

Bellman, R.

1  Dynamic Programming, Princeton University Press, Princeton,
   1957

Bryn, F.

1  "Optimal Signal Processing of Three-Dimensional Arrays
   Operating on Gaussian Signals and Noise", JASA 34 (1962)
   289-297

Courant, R. and Hilbert, D.

1  Methods of Mathematical Physics, Volume II: Partial
   Differential Equations, Interscience, New York, 1962

Cramer, H.

1  Mathematical Methods of Statistics, Princeton University
   Press, Princeton, 1963

Cron, B.F. and Sherman, C.H.

1  "Spatial-Correlation Functions for Various Noise Models",
   JASA 34 (1962) 1732-1736

2  "Addendum: Spatial-Correlation Functions for Various
   Noise Models (JASA 34, 1732-1736)", JASA 38 (1965) 885

Feldman, J.

1  "Equivalence and Perpendicularity of Gaussian Processes",
   Pacific J. Math. 8 (1958) 699-708

2  "Correction to 'Equivalence and Perpendicularity of
   Gaussian Processes'", Pacific J. Math. 8 (1958) 1295-1296




Feldman, J.

3  "Some Classes of Equivalent Gaussian Processes on an
   Interval", Pacific J. Math. 10 (1960) 1211-1220

Gaarder, N.T.

1  The Design of Point Detector Arrays, Ph.D. Dissertation,
   Stanford University, 1965

2  "The Design of Point Detector Arrays, I", IEEE Trans. Info.
   Th. IT-13 (1967) 42-50

3  "The Design of Point Detector Arrays, II", IEEE Trans.
   Info. Th. IT-12 (1966) 112-120

Goode, B.

1  "Comments on 'Detection of Random Acoustic Signals by
   Receivers with Distributed Elements: Optimum Receiver
   Structures for Normal Signal and Noise Fields'", JASA
   39 (1966) 1193-1194

Grenander, U.

1  "Stochastic Processes and Statistical Inference",
   Ark. Mat. 1 (1950) 195-277

Hajek, J.

1  "On a Property of Normal Distributions of any Stochastic
   Process", Czech. Math. J. (1958) 610-618.
   Translated in Amer. Math. Soc. Translations in Prob. and
   Stat. (1961)

Halmos, P.R.

1  Measure Theory, Van Nostrand, Princeton, 1950

Hamming, R.W.

1  Numerical Methods for Scientists and Engineers,
   McGraw-Hill, 1962

Hormander, L.

1  Linear Partial Differential Operators, Academic Press,
New  York,  190'< 


Kailath, T.

1  "Correlation Detection of Signals Perturbed by a Random
   Channel", IRE Trans. Info. Th. 6 (1960) 361-366




Kakutani, S.

1  "On Equivalence of Infinite Product Measures", Ann. Math.
   49 (1948) 214-224

Kallianpur, G. and Oodaira, H.

1  "The Equivalence and Singularity of Gaussian Measures",
   Chapter 19 in Time Series Analysis, Murray Rosenblatt,
   ed., Wiley, 1963

Kingman, J.F.C. and Taylor, S.J.

1  Introduction to Measure and Probability, University Press,
   Cambridge, 1966

Lavrentiev, M.M.

1  Some Improperly Posed Problems of Mathematical Physics,
   Springer-Verlag, New York, 1967

Liusternik, L.A. and Sobolev, V.J.

1  Elements of Functional Analysis, Frederick Ungar, New York,
   1961

Lowenstein, C.D. and Anderson, V.C.

1  "Quick Characterization of the Directional Response of
   Point Array", JASA 43 (1968) 32-36

Martel, H.C. and Mathews, M.V.

1  "Further Results on the Detectability of Known Signals
   in Gaussian Noise", Bell Sys. Tech. J. 40 (1961) 423-451

Middleton, D. and Groginsky, H.L.

1  "Detection of Random Acoustic Signals by Receivers with
   Distributed Elements: Optimum Receiver Structures for
   Normal Signal and Noise Fields", JASA 38 (1965) 727-737

Morse, P.M. and Ingard, K.U.

1  Theoretical Acoustics, McGraw-Hill, New York, 1968

Officer, C.B.

1  Introduction to the Theory of Sound Transmission, with
   application to the ocean, McGraw-Hill, New York, 1958




Parthasarathy, K.R.

1  Probability Measures on Metric Spaces, Academic Press,
   New York, 1967

Rao, C.R. and Varadarajan, V.S.

1  "Discrimination of Gaussian Processes", Sankhya, Series A,
   25 (1963) 303-330

Root, W.L.

1  "Singular Gaussian Measures in Detection Theory", Chapter
   20 in Time Series Analysis, Murray Rosenblatt, ed.,
   Wiley, 1963

Shearman, E.D.R.

1  "Non-collinear and Cylindrical Multiplicative Arrays",
   J. Brit. IRE, December 1963, 481-484

Shepp, L.A.

1  private communication

2  "The Singularity of Gaussian Measures in Function Space",
   Proc. Nat. Acad. of Sciences 52 (1964) 430-433

3  "Radon-Nikodym Derivatives of Gaussian Measures", Ann.
   Math. Stat. 37 (1966) 321-354

Schultheiss, P.M.

1  Degradation of Target Detectability Due to Clipping,
   Progress Report l! C, General Dynamics/Electric Boat
   Research, Department of Engineering and Applied Science,
   Yale University, June 1963

2  Passive Detection of a Sonar Target in a Background of
   Ambient Noise and Interference from a Second Target,
   Progress Report 17, General Dynamics/Electric Boat
   Research, Department of Engineering and Applied Science,
   Yale University, September 1964

Skolnik, M.I., Nemhauser, G., and Sherman III, J.W.

1  "Dynamic Programming Applied to Unequally Spaced Arrays",
   IEEE Trans. Antennas and Propagation, January 1964, 35-43

Slepian, D.

1  "Some Comments on the Detection of Gaussian Signals in
   Gaussian Noise", IRE Trans. Info. Th. 4 (1958) 65-68




Sokolnikoff, I.S.

1  Tensor Analysis, second edition, Wiley, New York, 1964

Thomas, T.Y.

1  Concepts from Tensor Analysis and Differential Geometry,
   Academic Press, New York, 1965

Usher, T.

1  Signal Detection by Arrays in Noise Fields with Local
   Variations, Progress Report 02, General Dynamics/Electric
   Boat Research, Department of Engineering and Applied
   Science, Yale University, March 1963

Vanderkulk, W.

1  "Optimum Processing for Acoustic Arrays", J. Brit. IRE,
   October 1963, 285-292

Varberg, D.E.

1  "On Equivalence of Gaussian Measures", Pacific J. Math.
   11 (1961) 751-762

Wilkinson, J.H.

1  The Algebraic Eigenvalue Problem, The Clarendon Press,
   Oxford, 1965

Yaglom, A.M.

1  "On the Equivalence and Perpendicularity of Two Gaussian
   Probability Measures in Function Space", Chapter 22 in
   Time Series Analysis, Murray Rosenblatt, ed., Wiley,
   New York, 1963




Unclassified

DOCUMENT CONTROL DATA - R & D

(Security classification of title, body of abstract and indexing
annotation must be entered when the overall report is classified)

1.   ORIGINATING ACTIVITY (Corporate author)
     General Dynamics Corporation
     Electric Boat Division
     Groton, Connecticut

2a.  REPORT SECURITY CLASSIFICATION
     Unclassified

2b.  GROUP

3.   REPORT TITLE
     PROCESSING OF DATA FROM SONAR SYSTEMS, VOLUME VII

4.   DESCRIPTIVE NOTES (Type of report and inclusive dates)
     July 1, 1968 to April 30, 1970

5.   AUTHOR(S) (First name, middle initial, last name)
     Franz B. Tuteur, John H. Chang, Verne H. MacDonald, and James P. Gray

6.   REPORT DATE
     August 11, 1970

7a.  TOTAL NO. OF PAGES
     361

8a.  CONTRACT OR GRANT NO.
     N00014-68-C-0392

b.   PROJECT NO.

9a.  ORIGINATOR'S REPORT NUMBER(S)
     U417-70-051

9b.  OTHER REPORT NO(S) (Any other numbers that may be assigned
     this report)

10.  DISTRIBUTION STATEMENT
     Each transmittal of this document outside the agencies of the
     U.S. Government must have prior approval of the Office of
     Naval Research (Code 466)

11.  SUPPLEMENTARY NOTES
     Reproduction of this publication in whole or in part is
     permitted for any purpose of the United States Government.

12.  SPONSORING MILITARY ACTIVITY
     Office of Naval Research
     Washington, D. C.

13.  ABSTRACT
     Volume VII contains four progress reports (38, 39, 40, and 41),
     three of which are continuations of work covered in earlier
     volumes. These deal with the effects of anisotropy of the
     background noise field, also referred to as interference noise.
     The three reports cover the form of the optimum detector
     (No. 38), the behavior of adaptive detectors (No. 39), and
     bearing estimation under these noise conditions (No. 40).
     Report No. 41, which deals with the effect of signal models and
     the various possibilities for singular detection, is a
     substantial departure from the work described in previous
     progress reports.

DD FORM 1473 (PAGE 1)

Unclassified
Security Classification

14.  KEY WORDS
     Antisubmarine Warfare
     Sonar
     Optimum Detector
     Adaptive Detector
     Signal Models

DD FORM 1473 (BACK) (PAGE 2)

